Tutorials and Reviews

International Journal of Bifurcation and Chaos, Vol. 13, No. 11 (2003) 3147–3233
© World Scientific Publishing Company




                            TOWARD A THEORY OF CHAOS

                                                  A. SENGUPTA
                                      Department of Mechanical Engineering,
                                      Indian Institute of Technology Kanpur,
                                              Kanpur 208016, India
                                                  osegu@iitk.ac.in

                            Received February 23, 2001; Revised September 19, 2002


          This paper formulates a new approach to the study of chaos in discrete dynamical systems
          based on the notions of inverse ill-posed problems, set-valued mappings, generalized and
          multivalued inverses, graphical convergence of a net of functions in an extended multifunction
          space [Sengupta & Ray, 2000], and the topological theory of convergence. Order, chaos and
          complexity are described as distinct components of this unified mathematical structure that
          can be viewed as an application of the theory of convergence in topological spaces to
          increasingly nonlinear mappings, with the boundary between order and complexity in the
          topology of graphical convergence being the region in Multi(X) that is susceptible to chaos.
          The paper uses results from the discretized spectral approximation in neutron transport theory
          [Sengupta, 1988, 1995] and concludes that the numerically exact results obtained by this
          approximation of the Case singular eigenfunction solution are due to the graphical convergence
          of the Poisson and conjugate Poisson kernels to the Dirac delta and the principal value
          multifunctions respectively. In Multi(X), the continuous spectrum is shown to reduce to a
          point spectrum, and we introduce a notion of latent chaotic states to interpret superposition
          over generalized eigenfunctions. Along with these latent states, spectral theory of nonlinear
          operators is used to conclude that nature supports complexity to attain efficiently a
          multiplicity of states that otherwise would remain unavailable to it.

          Keywords: Chaos; complexity; ill-posed problems; graphical convergence; topology;
          multifunctions.




Prologue

1. Generally speaking, the analysis of chaos is extremely difficult. While a general definition
for chaos applicable to most cases of interest is still lacking, mathematicians agree that for the
special case of iteration of transformations there are three common characteristics of chaos:

    1. Sensitive dependence on initial conditions,
    2. Mixing,
    3. Dense periodic points.

                    [Peitgen, Jurgens & Saupe, 1992]

2. The study of chaos is a part of a larger program of study of so-called “strongly” nonlinear
system. . . . Linearity means that the rule that determines what a piece of a system is going to
do next is not influenced by what it is doing now. More precisely this is intended in a
differential or incremental sense: For a linear spring, the increase of its tension is
proportional to the increment whereby it is stretched, with the ratio of these increments exactly
independent of how much it has already been stretched. Such a spring can be stretched arbitrarily
far . . . . Accordingly no real spring is linear. The mathematics of linear objects is
particularly felicitous. As it happens, linear objects enjoy an identical, simple geometry. The
simplicity of



this geometry always allows a relatively easy mental image to capture the essence of a problem,
with the technicality, growing with the number of parts, basically a detail. The historical
prejudice against nonlinear problems is that no so simple nor universal geometry usually exists.

                    Mitchell Feigenbaum’s Foreword (pp. 1–7) in [Peitgen et al., 1992]

3. The objective of this symposium is to explore the impact of the emerging science of chaos on
various disciplines and the broader implications for science and society. The characteristic of
chaos is its universality and ubiquity. At this meeting, for example, we have scholars
representing mathematics, physics, biology, geophysics and geophysiology, astronomy, medicine,
psychology, meteorology, engineering, computer science, economics and social sciences.¹ Having so
many disciplines meeting together, of course, involves the risk that we might not always speak
the same language, even if all of us have come to talk about “chaos”.

                    Opening address of Heitor Gurgulino de Souza,
                    Rector United Nations University, Tokyo [de Souza, 1997]

4. The predominant approach (of how the different fields of science relate to one other) is
reductionist: Questions in physical chemistry can be understood in terms of atomic physics, cell
biology in terms of how biomolecules work . . . . We have the best of reasons for taking this
reductionist approach: it works. But shortfalls in reductionism are increasingly apparent (and)
there is something to be gained from supplementing the predominantly reductionist approach with
an integrative agenda. This special section on complex systems is an initial scan (where) we have
taken a “complex system” to be one whose properties are not fully explained by an understanding
of its component parts. Each Viewpoint author² was invited to define “complex” as it applied to
his or her discipline.

                    [Gallagher & Appenzeller, 1999]

5. One of the most striking aspects of physics is the simplicity of its laws. Maxwell’s
equations, Schroedinger’s equations, and Hamilton mechanics can each be expressed in a few
lines. . . . Everything is simple and neat except, of course, the world. Every place we look
outside the physics classroom we see a world of amazing complexity. . . . So why, if the laws are
so simple, is the world so complicated? To us complexity means that we have structure with
variations. Thus a living organism is complicated because it has many different working parts,
each formed by variations in the working out of the same genetic coding. Chaos is also found very
frequently. In a chaotic world it is hard to predict which variation will arise in a given place
and time. A complex world is interesting because it is highly structured. A chaotic world is
interesting because we do not know what is coming next. Our world is both complex and chaotic.
Nature can produce complex structures even in simple situations and obey simple laws even in
complex situations.

                    [Goldenfeld & Kadanoff, 1999]

6. Where chaos begins, classical science stops. For as long as the world has had physicists
inquiring into the laws of nature, it has suffered a special ignorance about disorder in the
atmosphere, in the turbulent sea, in the fluctuations in the wildlife populations, in the
oscillations of the heart and the brain. But in the 1970s a few scientists began to find a way
through disorder. They were mathematicians, physicists, biologists, chemists . . . (and) the
insights that emerged led directly into the natural world: the shapes of clouds, the paths of
lightning, the microscopic intertwining of blood vessels, the galactic clustering of stars. . . .
Chaos breaks across the lines that separate scientific disciplines, (and) has become a shorthand
name for a fast growing movement that is reshaping the fabric of the scientific establishment.

                    [Gleick, 1987]

7. order → complexity → chaos.

                    [Waldrop, 1992]

¹ A partial listing of papers is as follows: Chaos and Politics: Application of Nonlinear
Dynamics to Socio-Political issues; Chaos in Society: Reflections on the Impact of Chaos Theory
on Sociology; Chaos in Neural Networks; The Impact of Chaos on Mathematics; The Impact of Chaos
on Physics; The Impact of Chaos on Economic Theory; The Impact of Chaos on Engineering; The
Impact of Chaos on Biology; Dynamical Disease: And The Impact of Nonlinear Dynamics and Chaos on
Cardiology and Medicine.
² The eight Viewpoint articles are titled: Simple Lessons from Complexity; Complexity in
Chemistry; Complexity in Biological Signaling Systems; Complexity and the Nervous System;
Complexity, Pattern, and Evolutionary Trade-Offs in Animal Aggregation; Complexity in Natural
Landform Patterns; Complexity and Climate, and Complexity and the Economy.

8. Our conclusions based on these examples seem simple: At present chaos is a philosophical term,
not a rigorous mathematical term. It may be a subjective notion illustrating the present day
limitations of the human intellect or it may describe an intrinsic property of nature such as the
“randomness” of the sequence of prime numbers. Moreover, chaos may be undecidable in the sense of
Godel in that no matter what definition is given for chaos, there is some example of chaos which
cannot be proven to be chaotic from the definition.

                    [Brown & Chua, 1996]

9. My personal feeling is that the definition of a “fractal” should be regarded in the same way
as the biologist regards the definition of “life”. There is no hard and fast definition, but just
a list of properties characteristic of a living thing . . . . Most living things have most of the
characteristics on the list, though there are living objects that are exceptions to each of them.
In the same way, it seems best to regard a fractal as a set that has properties such as those
listed below, rather than to look for a precise definition which will certainly exclude some
interesting cases.

                    [Falconer, 1990]

10. Dynamical systems are often said to exhibit chaos without a precise definition of what this
means.

                    [Robinson, 1999]

1. Introduction

The purpose of this paper is to present a unified, self-contained mathematical structure and
physical understanding of the nature of chaos in a discrete dynamical system and to suggest a
plausible explanation of why natural systems tend to be chaotic. The somewhat extensive
quotations with which we begin above bear testimony both to the increasingly significant, and
perhaps all-pervasive, role of nonlinearity in the world today and to our imperfect state of
understanding of its manifestations. The list of papers at both the UN Conference [de Souza,
1997] and in Science [Gallagher & Appenzeller, 1999] is noteworthy if only to justify the
observation of Gleick [1987] that “chaos seems to be everywhere”. Even as everybody appears to be
finding chaos and complexity in all likely and unlikely places, and possibly because of it, it is
necessary that we have a mathematically clear physical understanding of these notions that are
supposedly reshaping our view of nature. This paper is an attempt to contribute to this goal. To
make this account essentially self-contained we include here, as far as this is practical, the
basics of the background material needed to understand the paper in the form of Tutorials and an
extended Appendix.

The paradigm of chaos of the kneading of the dough is considered to provide an intuitive basis of
the mathematics of chaos [Peitgen et al., 1992], and one of our fundamental objectives here is to
recount the mathematical framework of this process in terms of the theory of ill-posed problems
arising from non-injectivity [Sengupta, 1997], maximal ill-posedness, and graphical convergence
of functions [Sengupta & Ray, 2000]. A natural mathematical formulation of the kneading of the
dough in the form of stretch-cut-and-paste and stretch-cut-and-fold operations is in the
ill-posed problem arising from the increasing non-injectivity of the function $f$ modeling the
kneading operation.

Begin Tutorial 1: Functions and Multifunctions

A relation, or correspondence, between two sets $X$ and $Y$, written $M\colon X \rightrightarrows Y$,
is basically a rule that associates subsets of $X$ to subsets of $Y$; this is often expressed as
$(A, B) \in M$ where $A \subset X$ and $B \subset Y$ and $(A, B)$ is an ordered pair of sets. The
domain

\[ D(M) \overset{\text{def}}{=} \{A \subset X : (\exists Z \in M)(\pi_X(Z) = A)\} \]

and range

\[ R(M) \overset{\text{def}}{=} \{B \subset Y : (\exists Z \in M)(\pi_Y(Z) = B)\} \]

of $M$ are respectively the sets of $X$ which under $M$ correspond to sets in $Y$; here $\pi_X$
and $\pi_Y$ are the projections of $Z$ on $X$ and $Y$, respectively. Equivalently,
$D(M) = \{x \in X : M(x) \neq \emptyset\}$ and $R(M) = \bigcup_{x \in D(M)} M(x)$. The inverse
$M^-$ of $M$ is the relation

\[ M^- = \{(B, A) : (A, B) \in M\} \]

so that $M^-$ assigns $A$ to $B$ iff $M$ assigns $B$ to $A$. In general, a relation may assign
many elements in its range to a single element from its domain; of

Fig. 1. Functional and non-functional relations between two sets $X$ and $Y$: while $f$ and $g$
are functional relations, $M$ is not. (a) $f$ and $g$ are both injective and surjective (i.e.
they are bijective), (b) $g$ is bijective but $f$ is only injective and
$f^{-1}(\{y_2\}) = \emptyset$, (c) $f$ is not 1:1, $g$ is not onto, while (d) $M$ is not a
function but is a multifunction. [Panels (a)–(d) not recoverable from the extracted text.]


especial significance are functional relations³ $f$ that can assign only a unique element in
$R(f)$ to any element in $D(f)$. Figure 1 illustrates the distinction between arbitrary and
functional relations $M$ and $f$. This difference between functions (or maps) and multifunctions
is basic to our development and should be fully understood. Functions can again be classified as
injections (or 1:1) and surjections (or onto). $f\colon X \to Y$ is said to be injective (or
one-to-one) if $x_1 \neq x_2 \Rightarrow f(x_1) \neq f(x_2)$ for all $x_1, x_2 \in X$, while it
is surjective (or onto) if $Y = f(X)$. $f$ is bijective if it is both 1:1 and onto.

Associated with a function $f\colon X \to Y$ is its inverse $f^{-1}\colon Y \supseteq R(f) \to X$
that exists on $R(f)$ iff $f$ is injective. Thus when $f$ is bijective,
$f^{-1}(y) := \{x \in X : y = f(x)\}$ exists for every $y \in Y$; in fact $f$ is bijective iff
$f^{-1}(\{y\})$ is a singleton for each $y \in Y$. Non-injective functions are not at all rare; if
anything, they are very common even for linear maps and it would be perhaps safe to conjecture
that they are overwhelmingly predominant in the nonlinear world of nature. Thus for example, the
simple linear homogeneous differential equation with constant coefficients of order $n > 1$ has
$n$ linearly independent solutions so that the operator $D^n$ of $D^n(y) = 0$ has an
$n$-dimensional null space. Inverses of non-injective, and in general non-bijective, functions
will be denoted by $f^-$. If $f$ is not injective then

\[ A \subset f^- f(A) \overset{\text{def}}{=} \operatorname{sat}(A) \]

where $\operatorname{sat}(A)$ is the saturation of $A \subseteq X$ induced by $f$; if $f$ is not
surjective then

\[ f f^-(B) = B \cap f(X) \subseteq B. \]

If $A = \operatorname{sat}(A)$, then $A$ is said to be saturated, and $B \subseteq R(f)$ whenever
$f f^-(B) = B$. Thus for non-injective $f$, $f^- f$ is not an identity on $X$ just as $f f^-$ is
not $1_Y$ if $f$ is not surjective. However the set of relations

\[ f f^- f = f, \qquad f^- f f^- = f^- \tag{1} \]

that is always true will be of basic significance in this work. Following are some equivalent
statements

³ We do not distinguish between a relation and its graph although technically they are different
objects. Thus although a functional relation, strictly speaking, is the triple $(X, f, Y)$
written traditionally as $f\colon X \to Y$, we use it synonymously with the graph $f$ itself.
Parenthetically, the word functional in this paper is not necessarily employed for a
scalar-valued function, but is used in a wider sense to distinguish between a function and an
arbitrary relation (that is a multifunction). Formally, whereas an arbitrary relation from $X$ to
$Y$ is a subset of $X \times Y$, a functional relation must satisfy an additional restriction
that requires $y_1 = y_2$ whenever $(x, y_1) \in f$ and $(x, y_2) \in f$. In this subset
notation, $(x, y) \in f \Leftrightarrow y = f(x)$.

on the injectivity and surjectivity of functions $f\colon X \to Y$.

(Injec) $f$ is 1:1 $\Leftrightarrow$ there is a function $f_L\colon Y \to X$ called the left
inverse of $f$, such that $f_L f = 1_X$ $\Leftrightarrow$ $A = f^- f(A)$ for all subsets $A$ of
$X$ $\Leftrightarrow$ $f(\bigcap A_i) = \bigcap f(A_i)$.

(Surjec) $f$ is onto $\Leftrightarrow$ there is a function $f_R\colon Y \to X$ called the right
inverse of $f$, such that $f f_R = 1_Y$ $\Leftrightarrow$ $B = f f^-(B)$ for all subsets $B$ of
$Y$.

As we are primarily concerned with non-injectivity of functions, saturated sets generated by
equivalence classes of $f$ will play a significant role in our discussions. A relation $E$ on a
set $X$ is said to be an equivalence relation if it is⁴

(ER1) Reflexive: $(\forall x \in X)(xEx)$.
(ER2) Symmetric: $(\forall x, y \in X)(xEy \Rightarrow yEx)$.
(ER3) Transitive: $(\forall x, y, z \in X)(xEy \wedge yEz \Rightarrow xEz)$.

Equivalence relations group together unequal elements $x_1 \neq x_2$ of a set as equivalent
according to the requirements of the relation. This is expressed as $x_1 \sim x_2 \pmod{E}$ and
will be represented here by the shorthand notation $x_1 \sim_E x_2$, or even simply as
$x_1 \sim x_2$ if the specification of $E$ is not essential. Thus for a non-injective map if
$f(x_1) = f(x_2)$ for $x_1 \neq x_2$, then $x_1$ and $x_2$ can be considered to be equivalent to
each other since they map onto the same point under $f$; thus
$x_1 \sim_f x_2 \Leftrightarrow f(x_1) = f(x_2)$ defines the equivalence relation $\sim_f$
induced by the map $f$. Given an equivalence relation $\sim$ on a set $X$ and an element
$x \in X$ the subset

\[ [x] \overset{\text{def}}{=} \{y \in X : y \sim x\} \]

is called the equivalence class of $x$; thus $x \sim y \Leftrightarrow [x] = [y]$. In particular,
equivalence classes generated by $f\colon X \to Y$,
$[x]_f = \{x_\alpha \in X : f(x_\alpha) = f(x)\}$, will be a cornerstone of our analysis of chaos
generated by the iterates of non-injective maps, and the equivalence relation
$\sim_f := \{(x, y) : f(x) = f(y)\}$ generated by $f$ is uniquely defined by the partition that
$f$ induces on $X$. Of course as $x \sim x$, $x \in [x]$. It is a simple matter to see that any
two equivalence classes are either disjoint or equal so that the equivalence classes generated by
an equivalence relation on $X$ form a disjoint cover of $X$. The quotient set of $X$ under
$\sim$, denoted by $X/{\sim} := \{[x] : x \in X\}$, has the equivalence classes $[x]$ as its
elements; thus $[x]$ plays a dual role either as subsets of $X$ or as elements of $X/{\sim}$. The
rule $x \mapsto [x]$ defines a surjective function $Q\colon X \to X/{\sim}$ known as the quotient
map.

Example 1.1. Let

\[ S^1 = \{(x, y) \in \mathbb{R}^2 : x^2 + y^2 = 1\} \]

be the unit circle in $\mathbb{R}^2$. Consider $X = [0, 1]$ as a subspace of $\mathbb{R}$, define
a map

\[ q\colon X \to S^1, \qquad s \mapsto (\cos 2\pi s, \sin 2\pi s), \quad s \in X, \]

from $\mathbb{R}$ to $\mathbb{R}^2$, and let $\sim$ be the equivalence relation on $X$

\[ s \sim t \Leftrightarrow (s = t) \vee (s = 0, t = 1) \vee (s = 1, t = 0). \]

If we bend $X$ around till its ends touch, the resulting circle represents the quotient set
$Y = X/{\sim}$ whose points are equivalent under $\sim$ as follows

\[ [0] = \{0, 1\} = [1], \qquad [s] = \{s\} \ \text{for all } s \in (0, 1). \]

Thus $q$ is bijective for $s \in (0, 1)$ but two-to-one for the special values $s = 0$ and $1$, so
that for $s, t \in X$,

\[ s \sim t \Leftrightarrow q(s) = q(t). \]

This yields a bijection $h\colon X/{\sim} \to S^1$ such that

\[ q = h \circ Q \]

defines the quotient map $Q\colon X \to X/{\sim}$ by $h([s]) = q(s)$ for all $s \in [0, 1]$. The
situation is illustrated by the commutative diagram of Fig. 2 that appears as an integral
component in a different and more general context in Sec. 2. It is to be noted that commutativity
of the diagram implies that if a given equivalence relation $\sim$ on $X$ is completely
determined by $q$ that associates the partitioning equivalence classes in $X$ to unique points in
$S^1$, then $\sim$ is identical to the equivalence relation that is induced by $Q$ on $X$. Note
that a larger size of the equivalence classes can be obtained by considering $X = \mathbb{R}_+$
for which $s \sim t \Leftrightarrow |s - t| \in \mathbb{Z}_+$.

End Tutorial 1
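
The set-valued inverse, the saturation of a set and the equivalence classes induced by a non-injective map, as introduced in Tutorial 1, can be checked by hand on a small finite example. The following Python sketch is only an illustration: the map `f`, the sets `X`, `Y`, `A`, `B` and all function names are hypothetical choices, not taken from the paper, and the map is deliberately neither injective nor surjective.

```python
# Illustrative sketch (names and data are assumptions, not from the paper).
# A function f: X -> Y on finite sets is stored as a dict {x: f(x)}.

def image(f, A):
    """Direct image f(A) of a subset A of the domain."""
    return {f[x] for x in A}

def preimage(f, B):
    """Set-valued inverse f^-(B) = {x in X : f(x) in B}."""
    return {x for x in f if f[x] in B}

def saturation(f, A):
    """sat(A) = f^- f(A): the union of the classes [x]_f that meet A."""
    return preimage(f, image(f, A))

def classes(f):
    """Partition of X into the equivalence classes [x]_f = f^- f({x})."""
    return {frozenset(saturation(f, {x})) for x in f}

# Neither injective (0 ~ 1 and 3 ~ 4 ~ 5) nor surjective ("d" is never attained).
f = {0: "a", 1: "a", 2: "b", 3: "c", 4: "c", 5: "c"}
X, Y = set(f), {"a", "b", "c", "d"}

A = {0, 3}
B = {"a", "d"}

sat_A = saturation(f, A)          # strictly contains A because f is not injective
ff_B = image(f, preimage(f, B))   # equals B ∩ f(X), strictly inside B here
partition = classes(f)

# Relations (1), checked on the sample sets: f f^- f = f and f^- f f^- = f^-.
assert image(f, saturation(f, A)) == image(f, A)
assert preimage(f, ff_B) == preimage(f, B)

print(sorted(sat_A))                         # [0, 1, 3, 4, 5]
print(sorted(ff_B))                          # ['a']
print(sorted(sorted(c) for c in partition))  # [[0, 1], [2], [3, 4, 5]]
```

The three classes printed at the end are exactly the saturated sets that partition `X`; any subset that is a union of them satisfies `saturation(f, A) == A`.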

⁴ An alternate useful way of expressing these properties for a relation $R$ on $X$ is
(ER1) $R$ is reflexive iff $1_X \subseteq R$
(ER2) $R$ is symmetric iff $R = R^{-1}$
(ER3) $R$ is transitive iff $R \circ R \subseteq R$,
with $R$ an equivalence relation only if $R \circ R = R$.
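
The relational form of (ER1)–(ER3) given in the footnote lends itself to a direct check on finite sets. The following is a minimal Python sketch, assuming a relation is stored as a set of ordered pairs; the helper names and the two sample relations are hypothetical and serve only as an illustration. Reflexivity is tested as containment of the identity relation $1_X$ in the relation itself.

```python
# Illustrative sketch (helper names and sample relations are assumptions).
# A relation on X is a set of ordered pairs (x, y) with x, y in X.

def identity(X):
    """The identity relation 1_X = {(x, x) : x in X}."""
    return {(x, x) for x in X}

def inverse(R):
    """R^{-1} = {(y, x) : (x, y) in R}."""
    return {(y, x) for (x, y) in R}

def compose(S, R):
    """S o R = {(x, z) : (x, y) in R and (y, z) in S for some y}."""
    return {(x, z) for (x, y) in R for (w, z) in S if y == w}

def is_equivalence(R, X):
    reflexive = identity(X) <= R       # (ER1): 1_X is contained in R
    symmetric = R == inverse(R)        # (ER2): R equals its inverse
    transitive = compose(R, R) <= R    # (ER3): R o R is contained in R
    return reflexive and symmetric and transitive

X = {0, 1, 2, 3}

# Kernel of the non-injective map x -> x mod 2: x ~ y iff x and y have equal parity.
parity = {(x, y) for x in X for y in X if x % 2 == y % 2}

# A relation that fails all three properties.
chain = {(0, 1), (1, 2)}

parity_ok = is_equivalence(parity, X)            # True
chain_ok = is_equivalence(chain, X)              # False
idempotent = compose(parity, parity) == parity   # R o R = R holds for this equivalence
```

The `parity` relation is the equivalence induced by a non-injective map in the sense of Tutorial 1, which is why it passes all three tests and is idempotent under composition.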


Fig. 2. The quotient map $Q$. [Diagram not recoverable from the extracted text.]

One of the central concepts that we consider and employ in this work is the inverse $f^-$ of a
nonlinear, non-injective, function $f$; here the equivalence classes $[x]_f = f^- f(x)$ of
$x \in X$ are the saturated subsets of $X$ that partition $X$. While a detailed treatment of this
question in the form of the nonlinear ill-posed problem and its solution is given in Sec. 2
[Sengupta, 1997], it is sufficient to point out here from Figs. 1(c) and 1(d), that the inverse
of a non-injective function is not a function but a multifunction while the inverse of a
multifunction is a non-injective function. Hence one has the general result that

\[
\begin{aligned}
f \text{ is a non-injective function} &\iff f^- \text{ is a multifunction},\\
f \text{ is a multifunction} &\iff f^- \text{ is a non-injective function}.
\end{aligned} \tag{2}
\]

The inverse of a multifunction $M\colon X \rightrightarrows Y$ is a generalization of the
corresponding notion for a function $f\colon X \to Y$ such that

\[ M^-(y) \overset{\text{def}}{=} \{x \in X : y \in M(x)\} \]

leads to

\[ M^-(B) = \{x \in X : M(x) \cap B \neq \emptyset\} \]

for any $B \subseteq Y$, while a more restricted inverse that we shall not be concerned with is
given as $M^+(B) = \{x \in X : M(x) \subseteq B\}$. Obviously, $M^+(B) \subseteq M^-(B)$. A
multifunction is injective if $x_1 \neq x_2 \Rightarrow M(x_1) \cap M(x_2) = \emptyset$, and
commonly with functions, it is true that
$M\bigl(\bigcup_{\alpha \in D} A_\alpha\bigr) = \bigcup_{\alpha \in D} M(A_\alpha)$ and
$M\bigl(\bigcap_{\alpha \in D} A_\alpha\bigr) \subseteq \bigcap_{\alpha \in D} M(A_\alpha)$ where
$D$ is an index set. The following illustrates the difference between the two inverses of $M$.
Let $X$ be a set that is partitioned into two disjoint $M$-invariant subsets $X_1$ and $X_2$. If
$x \in X_1$ (or $x \in X_2$) then $M(x)$ represents that part of $X_1$ (or of $X_2$) that is
realized immediately after one application of $M$, while $M^-(x)$ denotes the possible precursors
of $x$ in $X_1$ (or of $X_2$) and $M^+(B)$ is that subset of $X$ whose image lies in $B$ for any
subset $B \subset X$.

In this paper the multifunctions that we shall be explicitly concerned with arise as the inverses
of non-injective maps.

The second major component of our theory is the graphical convergence of a net of functions to a
multifunction. In Tutorial 2 below, we replace for the sake of simplicity and without loss of
generality, the net (which is basically a sequence where the index set is not necessarily the
positive integers; thus every sequence is a net but the family⁵ indexed, for example, by
$\mathbb{Z}$, the set of all integers, is a net and not a sequence) with a sequence and provide
the necessary background and motivation for the concept of graphical convergence.

Begin Tutorial 2: Convergence of Functions

This Tutorial reviews the inadequacy of the usual notions of convergence of functions either to
limit functions or to distributions and suggests the motivation and need for introduction of the
notion of graphical convergence of functions to multifunctions. Here, we follow closely the
exposition of Korevaar [1968], and use the notation $(f_k)_{k=1}^{\infty}$ to denote real or
complex valued functions on a bounded or unbounded interval $J$.

A sequence of piecewise continuous functions $(f_k)_{k=1}^{\infty}$ is said to converge to the
function $f$, notation $f_k \to f$, on a bounded or unbounded interval $J$⁶

(1) Pointwise if

\[ f_k(x) \to f(x) \quad \text{for all } x \in J, \]

5
  A function χ: D → X will be called a family in X indexed by D when reference to the domain D is of interest, and a net
when it is required to focus attention on its values in X.
6
  Observe that it is not being claimed that f belongs to the same class as (fk ). This is the single most important cornerstone
on which this paper is based: the need to “complete” spaces that are topologically “incomplete”. The classical high-school
example of the related problem of having to enlarge, or extend, spaces that are not big enough is the solution space of algebraic
equations with real coefficients like $x^2 + 1 = 0$.

i.e. Given any arbitrary real number $\varepsilon > 0$ there exists a $K \in \mathbb{N}$ that may depend on $x$, such that $|f_k(x) - f(x)| < \varepsilon$ for all $k \geq K$.

(2) Uniformly if
\[ \sup_{x \in J} |f(x) - f_k(x)| \to 0 \quad \text{as } k \to \infty, \]
i.e. Given any arbitrary real number $\varepsilon > 0$ there exists a $K \in \mathbb{N}$, such that $\sup_{x \in J} |f_k(x) - f(x)| < \varepsilon$ for all $k \geq K$.

(3) In the mean of order $p \geq 1$ if $|f(x) - f_k(x)|^p$ is integrable over $J$ for each $k$ and
\[ \int_J |f(x) - f_k(x)|^p \to 0 \quad \text{as } k \to \infty. \]
For $p = 1$, this is the simple case of convergence in the mean.

(4) In the mean m-integrally if it is possible to select indefinite integrals
\[ f_k^{(-m)}(x) = \pi_k(x) + \int_c^x dx_1 \int_c^{x_1} dx_2 \cdots \int_c^{x_{m-1}} dx_m\, f_k(x_m) \]
and
\[ f^{(-m)}(x) = \pi(x) + \int_c^x dx_1 \int_c^{x_1} dx_2 \cdots \int_c^{x_{m-1}} dx_m\, f(x_m) \]
such that for some arbitrary real $p \geq 1$,
\[ \int_J |f^{(-m)} - f_k^{(-m)}|^p \to 0 \quad \text{as } k \to \infty, \]
where the polynomials $\pi_k(x)$ and $\pi(x)$ are of degree $< m$, and $c$ is a constant to be chosen appropriately.

(5) Relative to test functions $\varphi$ if $f\varphi$ and $f_k\varphi$ are integrable over $J$ and
\[ \int_J (f_k - f)\varphi \to 0 \quad \text{for every } \varphi \in C_0^{\infty}(J) \text{ as } k \to \infty, \]
where $C_0^{\infty}(J)$ is the class of infinitely differentiable continuous functions that vanish throughout some neighborhood of each of the end points of $J$. For an unbounded $J$, a function is said to vanish in some neighborhood of $+\infty$ if it vanishes on some ray $(r, \infty)$.

While pointwise convergence does not imply any other type of convergence, uniform convergence on a bounded interval implies all the other convergences.

It is to be observed that apart from pointwise and uniform convergences, all the other modes listed above represent some sort of an averaged contribution of the entire interval $J$ and are therefore not of much use when pointwise behavior of the limit $f$ is necessary. Thus while limits in the mean are not unique, oscillating functions are tamed by m-integral convergence for adequately large values of $m$, and convergence relative to test functions, as we see below, can be essentially reduced to m-integral convergence. On the contrary, our graphical convergence (which may be considered as a pointwise biconvergence with respect to both the direct and inverse images of $f$, just as usual pointwise convergence is with respect to its direct image only) allows a sequence (in fact, a net) of functions to converge to an arbitrary relation, unhindered by external influences such as the effects of integrations and test functions. To see how this can indeed matter, consider the following

Example 1.2. Let $f_k(x) = \sin kx$, $k = 1, 2, \ldots$ and let $J$ be any bounded interval of the real line. Then 1-integrally we have
\[ f_k^{(-1)}(x) = -\frac{1}{k}\cos kx = -\frac{1}{k} + \int_0^x \sin kx_1\, dx_1, \]
which obviously converges to 0 uniformly (and therefore in the mean) as $k \to \infty$. And herein lies the point: even though we cannot draw any conclusion about the exact nature of $\sin kx$ as $k$ increases indefinitely (except that its oscillations become more and more pronounced), we may very definitely state that $\lim_{k \to \infty} (\cos kx)/k = 0$ uniformly. Hence from
\[ f_k^{(-1)}(x) \to 0 = 0 + \int_0^x \lim_{k \to \infty} \sin kx_1\, dx_1 \]
it follows that
\[ \lim_{k \to \infty} \sin kx = 0 \tag{3} \]
1-integrally.

Continuing with the same sequence of functions, we now examine its test-functional convergence with respect to $\varphi \in C_0^1(-\infty, \infty)$ that vanishes for all $x \notin (\alpha, \beta)$. Integrating by parts,
\[ \int_{-\infty}^{\infty} f_k \varphi = \int_\alpha^\beta \varphi(x_1) \sin kx_1\, dx_1 = -\frac{1}{k}\left[\varphi(x_1)\cos kx_1\right]_\alpha^\beta + \frac{1}{k}\int_\alpha^\beta \varphi'(x_1)\cos kx_1\, dx_1. \]
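The 1-integral convergence in the first half of Example 1.2 can be checked numerically. The following is a minimal Python sketch and not part of the original exposition: the interval $J = [0, 3]$, the grid size, and all function names are assumptions made only for illustration. It compares a grid estimate of $\sup_J |\sin kx|$, which stays near 1 for every $k$, with that of the indefinite integral $-(\cos kx)/k$, which is bounded by $1/k$.

```python
import math

A, B = 0.0, 3.0      # assumed bounded interval J = [A, B]
N = 4001             # assumed number of grid points

def sup_on_grid(g):
    """Maximum of |g| over a uniform grid on [A, B] (a stand-in for the sup over J)."""
    return max(abs(g(A + (B - A) * i / (N - 1))) for i in range(N))

def f(k):
    """f_k(x) = sin(kx)."""
    return lambda x: math.sin(k * x)

def f_int(k):
    """The indefinite integral used in Example 1.2: f_k^(-1)(x) = -(1/k) cos(kx)."""
    return lambda x: -math.cos(k * x) / k

# One row per k: (k, grid sup of |f_k|, grid sup of |f_k^(-1)|)
rows = [(k, sup_on_grid(f(k)), sup_on_grid(f_int(k))) for k in (1, 10, 100, 1000)]
for k, sup_f, sup_F in rows:
    print(f"k={k:5d}  sup|f_k| ~ {sup_f:.4f}   sup|f_k^(-1)| ~ {sup_F:.6f}")
```

The second column does not decay, so the sequence itself does not converge uniformly to 0, while the third column decays like $1/k$, which is the uniform convergence of the 1-integrals used in the example.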


[Figure 3: panels (a), (b), (c)]

Fig. 3. Incompleteness of function spaces. (a) demonstrates the classic example of non-completeness of the space of real-valued continuous functions leading to the complete spaces $L_n[a, b]$ whose elements are equivalence classes of functions with $f \sim g$ iff the Lebesgue integral $\int_a^b |f - g|^n = 0$. (b) and (c) illustrate distributional convergence of the functions $f_k(x)$ of Eq. (5) to the Dirac delta $\delta(x)$ leading to the complete space of generalized functions. In comparison, note that the space of continuous functions in the uniform metric $C[a, b]$ is complete, which suggests the importance of topologies in determining convergence properties of spaces.
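The behavior summarized in Fig. 3 can be illustrated numerically. The sketch below is an illustration under stated assumptions rather than anything taken from the paper: the interval $[a, b] = [-1, 1]$, the grid sizes, the particular smooth test function `bump`, and all function names are hypothetical choices. It estimates the uniform and mean distances between the ramp functions $f_j$ and $f_{2j}$ of Eq. (5), and the pairing $\int \delta_k \varphi$ of the pulses of Fig. 3(b) with a test function.

```python
import math

def ramp(k, x):
    """f_k of Eq. (5): 0 for x <= 0, kx on [0, 1/k], 1 for x >= 1/k."""
    return min(1.0, max(0.0, k * x))

def distances(j, k, a=-1.0, b=1.0, n=20001):
    """Grid estimates of the uniform and mean distances between f_j and f_k on [a, b]."""
    h = (b - a) / (n - 1)
    diffs = [abs(ramp(j, a + i * h) - ramp(k, a + i * h)) for i in range(n)]
    sup_dist = max(diffs)
    mean_dist = h * (sum(diffs) - 0.5 * (diffs[0] + diffs[-1]))  # trapezoidal rule
    return sup_dist, mean_dist

def bump(x):
    """Assumed test function: smooth, supported on [-1, 1], with bump(0) = exp(-1)."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def delta_pairing(k, m=2000):
    """Midpoint-rule estimate of k times the integral of bump over (0, 1/k)."""
    h = (1.0 / k) / m
    return k * h * sum(bump((i + 0.5) * h) for i in range(m))

for j in (1, 10, 100):
    sup_dist, mean_dist = distances(j, 2 * j)
    print(f"j={j:4d}  sup distance ~ {sup_dist:.4f}  mean distance ~ {mean_dist:.6f}  "
          f"pairing ~ {delta_pairing(j):.6f}")
```

For these ramps the uniform distance between $f_j$ and $f_{2j}$ stays at $1/2$ while the mean distance is $1/(4j)$, so the sequence is Cauchy in the mean but not uniformly; the pairing tends to $\varphi(0) = e^{-1}$ for the assumed test function.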



The first integrated term is 0 due to the conditions on $\varphi$, while the second also vanishes in the limit $k \to \infty$ because $\varphi \in C_0^1(-\infty, \infty)$. Hence
\[ \int_{-\infty}^{\infty} f_k \varphi \to 0 = \int_\alpha^\beta \lim_{k \to \infty} \varphi(x_1) \sin kx_1\, dx_1 \]
for all $\varphi$, leading to the conclusion that
\[ \lim_{k \to \infty} \sin kx = 0 \tag{4} \]
test-functionally.

This example illustrates the fact that if $\mathrm{Supp}(\varphi) = [\alpha, \beta] \subseteq J$,$^{7}$ integrating by parts a sufficiently large number of times so as to wipe out the pathological behavior of $(f_k)$ gives
\[ \int_J f_k \varphi = \int_\alpha^\beta f_k \varphi = -\int_\alpha^\beta f_k^{(-1)} \varphi' = \cdots = (-1)^m \int_\alpha^\beta f_k^{(-m)} \varphi^{(m)} \]
where $f_k^{(-m)}(x) = \pi_k(x) + \int_c^x dx_1 \int_c^{x_1} dx_2 \cdots \int_c^{x_{m-1}} dx_m\, f_k(x_m)$ is an m-times arbitrary indefinite integral of $f_k$. If now it is true that $\int_\alpha^\beta f_k^{(-m)} \to \int_\alpha^\beta f^{(-m)}$, then it must also be true that $f_k^{(-m)} \varphi^{(m)}$ converges in the mean to $f^{(-m)} \varphi^{(m)}$ so that
\[ \int_\alpha^\beta f_k \varphi = (-1)^m \int_\alpha^\beta f_k^{(-m)} \varphi^{(m)} \to (-1)^m \int_\alpha^\beta f^{(-m)} \varphi^{(m)} = \int_\alpha^\beta f \varphi. \]
In fact the converse also holds, leading to the following Equivalences between m-convergence in the mean and convergence with respect to test-functions [Korevaar, 1968].

Type 1 Equivalence. If $f$ and $(f_k)$ are functions on $J$ that are integrable on every interior subinterval, then the following are equivalent statements.

(a) For every interior subinterval $I$ of $J$ there is an integer $m_I \geq 0$, and hence a smallest integer $m \geq 0$, such that certain indefinite integrals $f_k^{(-m)}$ of the functions $f_k$ converge in the mean on $I$ to an indefinite integral $f^{(-m)}$; thus $\int_I |f_k^{(-m)} - f^{(-m)}| \to 0$.

(b) $\int_J (f_k - f)\varphi \to 0$ for every $\varphi \in C_0^{\infty}(J)$.

A significant generalization of this Equivalence is obtained by dropping the restriction that the limit object $f$ be a function. The need for this generalization arises because metric function spaces are

7
  By definition, the support (or supporting interval) of $\varphi(x) \in C_0^{\infty}[\alpha, \beta]$ is $[\alpha, \beta]$ if $\varphi$ and all its derivatives vanish for $x \leq \alpha$ and $x \geq \beta$.

known not to be complete: Consider the sequence of functions [Fig. 3(a)]
\[ f_k(x) = \begin{cases} 0, & \text{if } a \leq x \leq 0 \\ kx, & \text{if } 0 \leq x \leq \frac{1}{k} \\ 1, & \text{if } \frac{1}{k} \leq x \leq b \end{cases} \tag{5} \]
which is not Cauchy in the uniform metric $\rho(f_j, f_k) = \sup_{a \leq x \leq b} |f_j(x) - f_k(x)|$ but is Cauchy in the mean $\rho(f_j, f_k) = \int_a^b |f_j(x) - f_k(x)|\,dx$, or even pointwise. However in either case, $(f_k)$ cannot converge in the respective metrics to a continuous function and the limit is a discontinuous unit step function
\[ \Theta(x) = \begin{cases} 0, & \text{if } a \leq x \leq 0 \\ 1, & \text{if } 0 < x \leq b \end{cases} \]
with graph $([a, 0], 0) \cup ((0, b], 1)$, which is also integrable on $[a, b]$. Thus even if the limit of the sequence of continuous functions is not continuous, both the limit and the members of the sequence are integrable functions. This Riemann integration is not sufficiently general, however, and this type of integrability needs to be replaced by a much weaker condition resulting in the larger class of the Lebesgue integrable complete space of functions $L[a, b]$.$^{8}$

The functions in Fig. 3(b),
\[ \delta_k(x) = \begin{cases} k, & \text{if } 0 < x < \frac{1}{k} \\ 0, & x \in [a, b] - \left(0, \frac{1}{k}\right), \end{cases} \]
can be associated with the arbitrary indefinite integrals
\[ \Theta_k(x) \overset{\mathrm{def}}{=} \delta_k^{(-1)}(x) = \begin{cases} 0, & a \leq x \leq 0 \\ kx, & 0 < x < \frac{1}{k} \\ 1, & \frac{1}{k} \leq x \leq b \end{cases} \]
of Fig. 3(c), which, as noted above, converge in the mean to the unit step function $\Theta(x)$; hence $\int_{-\infty}^{\infty} \delta_k \varphi \equiv \int_\alpha^\beta \delta_k \varphi = -\int_\alpha^\beta \delta_k^{(-1)} \varphi' \to -\int_0^\beta \varphi'(x)\,dx = \varphi(0)$. But there can be no functional relation $\delta(x)$ for which $\int_\alpha^\beta \delta(x)\varphi(x)\,dx = \varphi(0)$ for all $\varphi \in C_0^1[\alpha, \beta]$, so that unlike in the case in Type 1 Equivalence, the limit in the mean $\Theta(x)$ of the indefinite integrals $\delta_k^{(-1)}(x)$ cannot be expressed as the indefinite integral $\delta^{(-1)}(x)$ of some function $\delta(x)$ on any interval containing the origin. This leads to the second more general type of equivalence.

Type 2 Equivalence. If $(f_k)$ are functions on $J$ that are integrable on every interior subinterval, then the following are equivalent statements.

(a) For every interior subinterval $I$ of $J$ there is an integer $m_I \geq 0$, and hence a smallest integer $m \geq 0$, such that certain indefinite integrals $f_k^{(-m)}$ of the functions $f_k$ converge in the mean on $I$ to an integrable function $\Theta$ which, unlike in Type 1 Equivalence, need not itself be an indefinite integral of some function $f$.

8
 Both Riemann and Lebesgue integrals can be formulated in terms of the so-called step functions $s(x)$, which are piecewise constant functions with values $(\sigma_i)_{i=1}^{I}$ on a finite number of bounded subintervals $(J_i)_{i=1}^{I}$ (which may reduce to a point or may not contain one or both of the end points) of a bounded or unbounded interval $J$, with integral $\int_J s(x)\,dx \overset{\mathrm{def}}{=} \sum_{i=1}^{I} \sigma_i |J_i|$. While the Riemann integral of a bounded function $f(x)$ on a bounded interval $J$ is defined with respect to sequences of step functions $(s_j)_{j=1}^{\infty}$ and $(t_j)_{j=1}^{\infty}$ satisfying $s_j(x) \leq f(x) \leq t_j(x)$ on $J$ with $\int_J (s_j - t_j) \to 0$ as $j \to \infty$ as $R\!\int_J f(x)\,dx = \lim \int_J s_j(x)\,dx = \lim \int_J t_j(x)\,dx$, the less restrictive Lebesgue integral is defined for arbitrary functions $f$ over bounded or unbounded intervals $J$ in terms of Cauchy sequences of step functions $\int_J |s_i - s_k| \to 0$, $i, k \to \infty$, converging to $f(x)$ as
\[ s_j(x) \to f(x) \quad \text{pointwise almost everywhere on } J, \]
to be
\[ \int_J f(x)\,dx \overset{\mathrm{def}}{=} \lim_{j \to \infty} \int_J s_j(x)\,dx. \]
That the Lebesgue integral is more general (and therefore is the proper candidate for completion of function spaces) is illustrated by the example of the function defined over $[0, 1]$ to be 0 on the rationals and 1 on the irrationals, for which an application of the definitions verifies that while the Riemann integral is undefined, the Lebesgue integral exists and has value 1. The Riemann integral of a bounded function over a bounded interval, whenever it exists, is equal to its Lebesgue integral. Because it involves a larger family of functions, all integrals in integral convergences are to be understood in the Lebesgue sense.

(b) $c_k(\varphi) = \int_J f_k \varphi \to c(\varphi)$ for every $\varphi \in C_0^{\infty}(J)$.

Since we are now given that $\int_I f_k^{(-m)}(x)\,dx \to \int_I \Psi(x)\,dx$, it must also be true that $f_k^{(-m)} \varphi^{(m)}$ converges in the mean to $\Psi \varphi^{(m)}$ whence
\[ \int_J f_k \varphi = (-1)^m \int_I f_k^{(-m)} \varphi^{(m)} \to (-1)^m \int_I \Psi \varphi^{(m)} \neq (-1)^m \int_I f^{(-m)} \varphi^{(m)}. \]
The natural question that arises at this stage is then: What is the nature of the relation (not function any more) $\Psi(x)$? For this it is now stipulated, despite the non-equality in the equation above, that as in the mean m-integral convergence of $(f_k)$ to a function $f$,
\[ \Theta(x) := \lim_{k \to \infty} \delta_k^{(-1)}(x) \overset{\mathrm{def}}{=} \int_{-\infty}^{x} \delta(x')\,dx' \tag{6} \]
defines the non-functional relation ("generalized function") $\delta(x)$ integrally as a solution of the integral equation (6) of the first kind; hence formally$^{9}$
\[ \delta(x) = \frac{d\Theta}{dx} \tag{7} \]

End Tutorial 2

The above tells us that the "delta function" is not a function but its indefinite integral is the piecewise continuous function $\Theta$ obtained as the mean (or pointwise) limit of a sequence of non-differentiable functions with the integral of $d\Theta_k(x)/dx$ being preserved for all $k \in \mathbb{Z}_+$. What then is the delta (and not its integral)? The answer to this question is contained in our multifunctional extension Multi$(X, Y)$ of the function space Map$(X, Y)$ considered in Sec. 3. Our treatment of ill-posed problems is used to obtain an understanding and interpretation of the numerical results of the discretized spectral approximation in neutron transport theory [Sengupta, 1988, 1995]. The main conclusions are the following: In a one-dimensional discrete system that is governed by the iterates of a nonlinear map, the dynamics is chaotic if and only if the system evolves to a state of maximal ill-posedness. The analysis is based on the non-injectivity, and hence ill-posedness, of the map; this may be viewed as a mathematical formulation of the stretch-and-fold and stretch-cut-and-paste kneading operations of the dough that are well-established artifacts in the theory of chaos, and the concept of maximal ill-posedness helps in obtaining a physical understanding of the nature of chaos. We do this through the fundamental concept of the graphical convergence of a sequence (generally a net) of functions [Sengupta & Ray, 2000] that is allowed to converge graphically, when the conditions are right, to a set-valued map or multifunction. Since ill-posed problems naturally lead to multifunctional inverses through functional generalized inverses [Sengupta, 1997], it is natural to seek solutions of ill-posed problems in multifunctional space Multi$(X, Y)$ rather than in spaces of functions Map$(X, Y)$; here Multi$(X, Y)$ is an extension of Map$(X, Y)$ that is generally larger than the smallest dense extension Multi$_{|}(X, Y)$.

Feedback and iteration are natural processes by which nature evolves itself. Thus almost every process of evolution is a self-correction process by which the system proceeds from the present to the future through a controlled mechanism of input and evaluation of the past. Evolution laws are inherently nonlinear and complex; here complexity is to be understood as the natural manifestation of the nonlinear laws that govern the evolution of the system.

This paper presents a mathematical description of complexity based on [Sengupta, 1997] and [Sengupta & Ray, 2000] and is organized as follows. In Sec. 1, we follow [Sengupta, 1997] to give an overview of ill-posed problems and their solution that forms the foundation of our approach. Sections 2 to 4 apply these ideas by defining a chaotic dynamical system as a maximally ill-posed problem; by doing this we are able to overcome the limitations of the three Devaney characterizations of chaos [Devaney, 1989] that apply to the specific case of iteration of transformations in a metric space, and the resulting graphical convergence of functions to multifunctions is the basic tool of our approach. Section 5 analyzes graphical convergence in Multi$(X)$ for the discretized spectral approximation

9
  The observant reader cannot have failed to notice how mathematical ingenuity successfully transferred the "troubles" of $(\delta_k)_{k=1}^{\infty}$ to the sufficiently differentiable benevolent receptor $\varphi$ so as to be able to work backward, via the resultant trouble free $(\delta_k^{(-m)})_{k=1}^{\infty}$, to the final object $\delta$. This necessarily hides the true character of $\delta$ to allow only a view of its integral manifestation on functions. This unfortunately is not general enough in the strongly nonlinear physical situations responsible for chaos, and is the main reason for constructing the multifunctional extension of function spaces that we use.

of neutron transport theory, which suggests a natural link between ill-posed problems and spectral theory of nonlinear operators. This seems to offer an answer to the question of why a natural system should increase its complexity, and eventually tend toward chaoticity, by becoming increasingly nonlinear.

2. Ill-Posed Problem and Its Solution

This section, based on [Sengupta, 1997], presents a formulation and solution of ill-posed problems arising out of the non-injectivity of a function $f: X \to Y$ between topological spaces X and Y. A workable knowledge of this approach is necessary as our theory of chaos, leading to the characterization of chaotic systems as being a maximally ill-posed state of a dynamical system, is a direct application of these ideas and can be taken to constitute a mathematical representation of the familiar stretch-cut-and-paste and stretch-and-fold paradigms of chaos. The problem of finding an $x \in X$ for a given $y \in Y$ from the functional relation $f(x) = y$ is an inverse problem that is ill-posed (or, the equation $f(x) = y$ is ill-posed) if any one or more of the following conditions are satisfied.

(IP1) f is not injective. This non-uniqueness problem of the solution for a given y is the single most significant criterion of ill-posedness used in this work.

(IP2) f is not surjective. For a $y \in Y$, this is the existence problem of the given equation.

(IP3) When f is bijective, the inverse $f^{-1}$ is not continuous, which means that small changes in y may lead to large changes in x.

A problem $f(x) = y$ for which a solution exists and is unique, and for which small changes in the data y lead to only small changes in the solution x, is said to be well-posed or properly posed. This means that $f(x) = y$ is well-posed if f is bijective and the inverse $f^{-1}: Y \to X$ is continuous; otherwise the equation is ill-posed or improperly posed. It is to be noted that the three criteria are not, in general, independent of each other. Thus if f represents a bijective, bounded linear operator between Banach spaces X and Y, then the inverse mapping theorem guarantees that the inverse $f^{-1}$ is continuous. Hence ill-posedness depends not only on the algebraic structures of X, Y, f but also on the topologies of X and Y.

Example 2.1. As a non-trivial example of an inverse problem, consider the heat equation

\[ \frac{\partial \theta(x,t)}{\partial t} = c^2\, \frac{\partial^2 \theta(x,t)}{\partial x^2} \]

for the temperature distribution $\theta(x,t)$ of a one-dimensional homogeneous rod of length L satisfying the initial condition $\theta(x,0) = \theta_0(x)$, $0 \le x \le L$, and boundary conditions $\theta(0,t) = 0 = \theta(L,t)$, $0 \le t \le T$, having the Fourier sine-series solution

\[ \theta(x,t) = \sum_{n=1}^{\infty} A_n \sin\!\left(\frac{n\pi}{L}x\right) e^{-\lambda_n^2 t} \tag{8} \]

where $\lambda_n = (c\pi/L)\,n$ and

\[ A_n = \frac{2}{L} \int_0^L \theta_0(x') \sin\!\left(\frac{n\pi}{L}x'\right) dx' \]

are the Fourier expansion coefficients. While the direct problem evaluates $\theta(x,t)$ from the differential equation and initial temperature distribution $\theta_0(x)$, the inverse problem calculates $\theta_0(x)$ from the integral equation

\[ \theta_T(x) = \frac{2}{L} \int_0^L k(x,x')\,\theta_0(x')\,dx', \qquad 0 \le x \le L, \]

when this final temperature $\theta_T$ is known, and

\[ k(x,x') = \sum_{n=1}^{\infty} \sin\!\left(\frac{n\pi}{L}x\right) \sin\!\left(\frac{n\pi}{L}x'\right) e^{-\lambda_n^2 T} \]

is the kernel of the integral equation. In terms of the final temperature the distribution becomes

\[ \theta(x,t) = \sum_{n=1}^{\infty} B_n \sin\!\left(\frac{n\pi}{L}x\right) e^{-\lambda_n^2 (t-T)} \tag{9} \]

with Fourier coefficients

\[ B_n = \frac{2}{L} \int_0^L \theta_T(x') \sin\!\left(\frac{n\pi}{L}x'\right) dx'. \]

In $L^2[0,L]$, Eqs. (8) and (9) at $t = T$ and $t = 0$ yield respectively

\[ \|\theta_T(x)\|_2^2 = \frac{L}{2} \sum_{n=1}^{\infty} A_n^2\, e^{-2\lambda_n^2 T} \le e^{-2\lambda_1^2 T}\, \|\theta_0\|_2^2 \tag{10} \]

\[ \|\theta_0\|_2^2 = \frac{L}{2} \sum_{n=1}^{\infty} B_n^2\, e^{2\lambda_n^2 T}. \tag{11} \]

The last two equations differ from each other in the significant respect that whereas Eq. (10) shows

that the direct problem is well-posed according to (IP3), Eq. (11) means that in the absence of similar bounds the inverse problem is ill-posed.¹⁰

Example 2.2. Consider the Volterra integral equation of the first kind

\[ y(x) = \int_a^x r(x')\,dx' = Kr \]

where $y, r \in C[a,b]$ and $K: C[0,1] \to C[0,1]$ is the corresponding integral operator. Since the differential operator $D = d/dx$ under the sup-norm $\|r\| = \sup_{0 \le x \le 1} |r(x)|$ is unbounded, the inverse problem $r = Dy$ for a differentiable function y on $[a,b]$ is ill-posed, see Example 6.1. However, $y = Kr$ becomes well-posed if y is considered to be in $C^1[0,1]$ with norm $\|y\| = \sup_{0 \le x \le 1} |Dy|$. This illustrates the importance of the topologies of X and Y in determining the ill-posed nature of the problem when this is due to (IP3).

Ill-posed problems in nonlinear mathematics of type (IP1) arising from the non-injectivity of f can be considered to be a generalization of non-uniqueness of solutions of linear equations as, for example, in eigenvalue problems or in the solution of a system of linear algebraic equations with a larger number of unknowns than the number of equations. In both cases, for a given $y \in Y$, the solution set of the equation $f(x) = y$ is given by

\[ f^-(y) = [x]_f = \{x' \in X : f(x') = f(x) = y\}. \]

A significant point of difference between linear and nonlinear problems is that unlike the special importance of 0 in linear mathematics, there are no preferred elements in nonlinear problems; this leads to a shift of emphasis from the null space of linear problems to equivalence classes for nonlinear equations. To motivate the role of equivalence classes, let us consider the null spaces in the following linear problems.

(a) Let $f: \mathbb{R}^2 \to \mathbb{R}$ be defined by $f(x,y) = x + y$, $(x,y) \in \mathbb{R}^2$. The null space of f is generated by the equation $y = -x$ on the x–y plane, and the graph of f is the plane passing through the lines $\rho = x$ and $\rho = y$. For each $\rho \in \mathbb{R}$ the equivalence classes $f^-(\rho) = \{(x,y) \in \mathbb{R}^2 : x + y = \rho\}$ are lines on the graph parallel to the null set.

(b) For a linear operator $A: \mathbb{R}^n \to \mathbb{R}^m$, $m < n$, satisfying (IP1) and (IP2), the problem $Ax = y$ reduces A to echelon form with rank r less than $\min\{m,n\}$, when the given equations are consistent. The solution, however, produces a generalized inverse leading to a set-valued inverse $A^-$ of A for which the inverse images of $y \in R(A)$ are multivalued because of the non-trivial null space of A introduced by assumption (IP1). Specifically, a null space of dimension $n - r$ is generated by the free variables $\{x_j\}_{j=r+1}^{n}$ which are arbitrary: this is ill-posedness of type (IP1). In addition, $m - r$ rows of the row-reduced echelon form of A have all 0 entries that introduce restrictions on $m - r$ coordinates $\{y_i\}_{i=r+1}^{m}$ of y which are now related to $\{y_i\}_{i=1}^{r}$: this illustrates ill-posedness of type (IP2). Inverse ill-posed problems therefore generate multivalued solutions through a generalized inverse of the mapping.

(c) The eigenvalue problem

\[ \left(\frac{d^2}{dx^2} + \lambda^2\right) y = 0, \qquad y(0) = 0 = y(1) \]

has the following equivalence class of 0

\[ [0]_{D^2} = \{\sin(\pi m x)\}_{m=0}^{\infty}, \qquad D^2 = \left(\frac{d^2}{dx^2} + \lambda^2\right), \]

as its eigenfunctions corresponding to the eigenvalues $\lambda_m = \pi m$.

Ill-posed problems are primarily of interest to us explicitly as non-injective maps f, that is under the condition of (IP1). The two other conditions (IP2) and (IP3) are not as significant and play only an implicit role in the theory. In its application to iterative systems, the degree of non-injectivity of f, defined as the number of its injective branches, increases with iteration of the map. A necessary (but not sufficient) condition for chaos to occur is the increasing non-injectivity of f that is expressed descriptively in the chaos literature as stretch-and-fold or stretch-cut-and-paste operations. This increasing non-injectivity, which we discuss in the following sections, is what causes a dynamical system to tend toward chaoticity. Ill-posedness arising from non-surjectivity of (injective) f in the form of regularization [Tikhonov & Arsenin, 1977] has received wide attention in the literature of ill-posed problems; this however is not of much significance in our work.
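The growth in the number of injective branches under iteration can be illustrated numerically. The sketch below rests on assumptions that are not part of the text above: it takes the logistic map f(x) = 4x(1 − x) on [0, 1] as an example map, and it estimates the number of branches of the n-th iterate by sampling it on a uniform grid and counting changes in the direction of monotonicity. The function names and the grid size are ours.

```python
def logistic(x, lam=4.0):
    """One application of the logistic map (an assumed example map)."""
    return lam * x * (1.0 - x)


def iterate(x, n, lam=4.0):
    """Apply the map n times to the point x."""
    for _ in range(n):
        x = logistic(x, lam)
    return x


def count_injective_branches(n, samples=20001, lam=4.0):
    """Estimate the number of monotone (injective) branches of the n-th
    iterate on [0, 1] by counting sign changes of successive differences."""
    values = [iterate(i / (samples - 1), n, lam) for i in range(samples)]
    branches = 1
    previous_sign = 0
    for left, right in zip(values, values[1:]):
        diff = right - left
        if diff == 0.0:
            continue  # flat step at grid resolution carries no direction
        sign = 1 if diff > 0 else -1
        if previous_sign and sign != previous_sign:
            branches += 1  # direction of monotonicity changed: new branch
        previous_sign = sign
    return branches


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, count_injective_branches(n))
```

For this particular map the count doubles with each iteration, which is the increasing non-injectivity referred to above; the grid must be fine enough to resolve the narrowest branch, so the default grid size is only adequate for small n.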

10
     Recall that for a linear operator continuity and boundedness are equivalent concepts.
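The contrast between Eqs. (10) and (11) of Example 2.1 can be made concrete by tabulating the factor that multiplies each Fourier mode. The following sketch uses the illustrative values c = 1, L = 1 and T = 0.1, which are our assumptions, with λ_n = cπn/L for a rod of length L. It evaluates the forward damping factor exp(−λ_n² T), the backward amplification factor exp(λ_n² T), and the L² change in θ_0 caused by an error in a single coefficient B_n as read off from Eq. (11). All function names are hypothetical.

```python
import math


def decay_rate(n, c=1.0, length=1.0):
    """lambda_n = (c * pi / L) * n for the n-th sine mode."""
    return c * math.pi * n / length


def forward_factor(n, final_time, c=1.0, length=1.0):
    """Damping of mode n in the direct problem, exp(-lambda_n^2 T)."""
    return math.exp(-decay_rate(n, c, length) ** 2 * final_time)


def backward_factor(n, final_time, c=1.0, length=1.0):
    """Amplification of mode n in the inverse problem, exp(+lambda_n^2 T)."""
    return math.exp(decay_rate(n, c, length) ** 2 * final_time)


def initial_norm_error(n, coefficient_error, final_time, c=1.0, length=1.0):
    """L2-norm change in theta_0 due to an error in the single coefficient
    B_n, from Eq. (11): sqrt(L/2) * |dB_n| * exp(lambda_n^2 T)."""
    return (math.sqrt(length / 2.0) * abs(coefficient_error)
            * backward_factor(n, final_time, c, length))


if __name__ == "__main__":
    T = 0.1
    for n in (1, 2, 5, 10):
        print(n, forward_factor(n, T), backward_factor(n, T),
              initial_norm_error(n, 1e-12, T))
```

The forward factors never exceed 1, consistent with the bound in Eq. (10), whereas the backward factors grow without bound in n, so that an arbitrarily small error in a high-order coefficient of θ_T produces an arbitrarily large error in θ_0.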

[Figure 4: two panels, (a) and (b); graphic not recoverable from the source.]

Fig. 4. (a) Moore–Penrose generalized inverse. The decomposition of X and Y into the four fundamental subspaces of A comprising the null space $N(A)$, the column (or range) space $R(A)$, the row space $R(A^T)$ and $N(A^T)$, the complement of $R(A)$ in Y, is a basic result in the theory of linear equations. The Moore–Penrose inverse takes advantage of the geometric orthogonality of the row space $R(A^T)$ and $N(A)$ in $\mathbb{R}^n$ and that of the column space and $N(A^T)$ in $\mathbb{R}^m$. (b) When X and Y are not inner-product spaces, a non-injective inverse can be defined by extending f to $Y - R(f)$ suitably as shown by the dashed curve, where $g(x) := r_1 + ((r_2 - r_1)/r_1) f(x)$ for all $x \in D(f)$ was taken to be a good definition of an extension that replicates f in $Y - R(f)$; here $x_1 \sim x_2$ under both f and g, and $y_1 \sim y_2$ under $\{f, g\}$ just as b is equivalent to b in the Moore–Penrose case. Note that $\{f, g\}$ and $\{f^-, g^-\}$ are both multifunctions on X and Y, respectively. Our inverse G, introduced later in this section, is however injective with $G(Y - R(f)) := 0$.


Begin Tutorial 3: Generalized Inverse

In this Tutorial, we take a quick look at the equation $a(x) = y$, where $a: X \to Y$ is a linear map that need not be either one-one or onto. Specifically, we will take X and Y to be the Euclidean spaces $\mathbb{R}^n$ and $\mathbb{R}^m$ so that a has a matrix representation $A \in \mathbb{R}^{m \times n}$, where $\mathbb{R}^{m \times n}$ is the collection of $m \times n$ matrices with real entries. The inverse $A^{-1}$ exists and is unique iff $m = n$ and $\mathrm{rank}(A) = n$; this is the situation depicted in Fig. 1(a). If A is neither one-one nor onto, then we need to consider the multifunction $A^-$, a functional choice of which is known as the generalized inverse G of A. A good introductory text for generalized inverses is [Campbell & Mayer, 1979]. Figure 4(a) introduces the following definition of the Moore–Penrose generalized inverse $G_{MP}$.

Definition 2.1 (Moore–Penrose Inverse). If $a: \mathbb{R}^n \to \mathbb{R}^m$ is a linear transformation with matrix representation $A \in \mathbb{R}^{m \times n}$ then the Moore–Penrose inverse $G_{MP} \in \mathbb{R}^{n \times m}$ of A (we will use the same notation $G_{MP}: \mathbb{R}^m \to \mathbb{R}^n$ for the inverse of the map a) is the noninjective map defined in terms of the row and column spaces of A, $\mathrm{row}(A) = R(A^T)$, $\mathrm{col}(A) = R(A)$, as

\[ G_{MP}(y) \overset{\mathrm{def}}{=} \begin{cases} (a|_{\mathrm{row}(A)})^{-1}(y), & \text{if } y \in \mathrm{col}(A) \\ 0, & \text{if } y \in N(A^T). \end{cases} \tag{12} \]

Note that the restriction $a|_{\mathrm{row}(A)}$ of a to $R(A^T)$ is bijective so that the inverse $(a|_{\mathrm{row}(A)})^{-1}$ is well-defined. The role of the transpose matrix appears naturally, and the $G_{MP}$ of Eq. (12) is the unique matrix that satisfies the conditions

\[ \begin{aligned} A G_{MP} A &= A, & G_{MP} A G_{MP} &= G_{MP}, \\ (G_{MP} A)^T &= G_{MP} A, & (A G_{MP})^T &= A G_{MP} \end{aligned} \tag{13} \]

that follow immediately from the definition (12); hence $G_{MP} A$ and $A G_{MP}$ are orthogonal projections¹¹ onto the subspaces $R(A^T) = R(G_{MP})$ and $R(A)$, respectively. Recall that the range space $R(A^T)$ of $A^T$ is the same as the row space $\mathrm{row}(A)$ of A, and $R(A)$ is also known as the column space of A, $\mathrm{col}(A)$.

11
     A real matrix A is an orthogonal projector iff $A^2 = A$ and $A = A^T$.
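The four conditions of Eq. (13) can be checked in exact rational arithmetic. The sketch below does this for the 4×5 matrix A of Example 2.3 and the inverse reported there as Eq. (14), with the entries transcribed from that example. The helpers `matmul` and `transpose` are our own minimal implementations, not part of any library, and the check says nothing about how the inverse was originally computed.

```python
from fractions import Fraction as F


def matmul(P, Q):
    """Product of two matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)]
            for row in P]


def transpose(P):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*P)]


# Matrix A of Example 2.3 (rank 2).
A = [[F(v) for v in row] for row in [
    [1, -3, 2, 1, 2],
    [3, -9, 10, 2, 9],
    [2, -6, 4, 2, 4],
    [2, -6, 8, 1, 7],
]]

# Moore-Penrose inverse as reported in Eq. (14).
G = [
    [F(9, 275), F(-1, 275), F(18, 275), F(-2, 55)],
    [F(-27, 275), F(3, 275), F(-54, 275), F(6, 55)],
    [F(-10, 143), F(6, 143), F(-20, 143), F(16, 143)],
    [F(238, 3575), F(-57, 3575), F(476, 3575), F(-59, 715)],
    [F(-129, 3575), F(106, 3575), F(-258, 3575), F(47, 715)],
]

AG = matmul(A, G)  # 4x4, expected to project onto col(A)
GA = matmul(G, A)  # 5x5, expected to project onto row(A)

penrose = {
    "A G A = A": matmul(AG, A) == A,
    "G A G = G": matmul(GA, G) == G,
    "(G A)^T = G A": transpose(GA) == GA,
    "(A G)^T = A G": transpose(AG) == AG,
}

# The trace of a projector equals its rank.
rank_from_trace = sum(AG[i][i] for i in range(len(AG)))

if __name__ == "__main__":
    for name, holds in penrose.items():
        print(name, holds)
    print("trace(A G) =", rank_from_trace)
```

Because the conditions of Eq. (13) make both products symmetric and idempotent, they satisfy the projector criterion of footnote 11, and the trace of either product gives the rank of A.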

Example 2.3. For $a: \mathbb{R}^5 \to \mathbb{R}^4$, let

\[ A = \begin{pmatrix} 1 & -3 & 2 & 1 & 2 \\ 3 & -9 & 10 & 2 & 9 \\ 2 & -6 & 4 & 2 & 4 \\ 2 & -6 & 8 & 1 & 7 \end{pmatrix}. \]

By reducing the augmented matrix $(A|y)$ to the row-reduced echelon form, it can be verified that the null and range spaces of A are three- and two-dimensional, respectively. A basis for the null space of $A^T$ and of the row and column space of A obtained from the echelon form are respectively

\[ \begin{pmatrix} -2 \\ 0 \\ 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 \\ -1 \\ 0 \\ 1 \end{pmatrix}; \quad \text{and} \quad \begin{pmatrix} 1 \\ -3 \\ 0 \\ \tfrac{3}{2} \\ \tfrac{1}{2} \end{pmatrix}, \begin{pmatrix} 0 \\ 0 \\ 1 \\ -\tfrac{1}{4} \\ \tfrac{3}{4} \end{pmatrix}; \quad \begin{pmatrix} 1 \\ 0 \\ 2 \\ -1 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \\ 0 \\ 1 \end{pmatrix}. \]

According to its definition Eq. (12), the Moore–Penrose inverse maps the middle two of the above set to $(0, 0, 0, 0, 0)^T$, and the A-image of the first two (which are respectively $(19, 70, 38, 51)^T$ and $(70, 275, 140, 205)^T$ lying, as they must, in the span of the last two), to the span of $(1, -3, 2, 1, 2)^T$ and $(3, -9, 10, 2, 9)^T$ because a restricted to this subspace of $\mathbb{R}^5$ is bijective. Hence

\[ G_{MP} \left( A \begin{pmatrix} 1 \\ -3 \\ 0 \\ \tfrac{3}{2} \\ \tfrac{1}{2} \end{pmatrix} \;\; A \begin{pmatrix} 0 \\ 0 \\ 1 \\ -\tfrac{1}{4} \\ \tfrac{3}{4} \end{pmatrix} \;\; \begin{matrix} -2 \\ 0 \\ 1 \\ 0 \end{matrix} \;\; \begin{matrix} 1 \\ -1 \\ 0 \\ 1 \end{matrix} \right) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ -3 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ \tfrac{3}{2} & -\tfrac{1}{4} & 0 & 0 \\ \tfrac{1}{2} & \tfrac{3}{4} & 0 & 0 \end{pmatrix}. \]

The second matrix on the left is invertible as its rank is 4. This gives

\[ G_{MP} = \begin{pmatrix} \frac{9}{275} & -\frac{1}{275} & \frac{18}{275} & -\frac{2}{55} \\ -\frac{27}{275} & \frac{3}{275} & -\frac{54}{275} & \frac{6}{55} \\ -\frac{10}{143} & \frac{6}{143} & -\frac{20}{143} & \frac{16}{143} \\ \frac{238}{3575} & -\frac{57}{3575} & \frac{476}{3575} & -\frac{59}{715} \\ -\frac{129}{3575} & \frac{106}{3575} & -\frac{258}{3575} & \frac{47}{715} \end{pmatrix} \tag{14} \]

as the Moore–Penrose inverse of A that readily verifies all the four conditions of Eqs. (13). The basic point here is that, as in the case of a bijective map, $G_{MP} A$ and $A G_{MP}$ are identities on the row and column spaces of A that define its rank. For later use, when we return to this example for a simpler inverse G, given below are the orthonormal bases of the four fundamental subspaces with respect to which $G_{MP}$ is a representation of the generalized inverse of A; these calculations were done by MATLAB. The basis for

(a) the column space of A consists of the first two columns of the eigenvectors of $AA^T$:

\[ \left( -\frac{1633}{2585}, -\frac{363}{892}, \frac{3317}{6387}, \frac{363}{892} \right)^T, \qquad \left( -\frac{929}{1435}, \frac{709}{1319}, \frac{346}{6299}, -\frac{709}{1319} \right)^T \]

(b) the null space of $A^T$ consists of the last two columns of the eigenvectors of $AA^T$:

\[ \left( -\frac{3185}{8306}, \frac{293}{2493}, -\frac{3185}{4153}, \frac{1777}{3547} \right)^T, \qquad \left( \frac{323}{1732}, \frac{533}{731}, \frac{323}{866}, \frac{1037}{1911} \right)^T \]

(c) the row space of A consists of the first two columns of the eigenvectors of $A^TA$:

\[ \left( \frac{421}{13823}, \frac{44}{14895}, -\frac{569}{918}, -\frac{659}{2526}, \frac{1036}{1401} \right), \qquad \left( \frac{661}{690}, \frac{412}{1775}, \frac{59}{2960}, -\frac{1523}{10221}, -\frac{303}{3974} \right) \]

(d) the null space of A consists of the last three columns of the eigenvectors of $A^TA$:

\[ \left( -\frac{571}{15469}, -\frac{369}{776}, \frac{149}{25344}, -\frac{291}{350}, -\frac{389}{1365} \right), \]
\[ \left( -\frac{281}{1313}, \frac{956}{1489}, \frac{875}{1706}, -\frac{1279}{2847}, \frac{409}{1473} \right), \]
\[ \left( \frac{292}{1579}, -\frac{876}{1579}, \frac{203}{342}, \frac{621}{4814}, \frac{1157}{2152} \right). \]

The matrices $Q_1$ and $Q_2$ with these eigenvectors $(x_i)$ satisfying $\|x_i\| = 1$ and $(x_i, x_j) = 0$ for $i \ne j$ as their columns are orthogonal matrices with the simple inverse criterion $Q^{-1} = Q^T$.

End Tutorial 3

The basic issue in the solution of the inverse ill-posed problem is its reduction to a well-posed one when restricted to suitable subspaces of the domain and range of A. Considerations of geometry leading to their decomposition into orthogonal subspaces are only an additional feature that is not central to the problem: recall from Eq. (1) that any function f must necessarily satisfy the more general set-theoretic relations $f f^- f = f$ and $f^- f f^- = f^-$ of Eq. (13) for the multiinverse $f^-$ of $f: X \to Y$. The second distinguishing feature of the MP-inverse is that it is defined, by a suitable extension, on all of Y and not just on $f(X)$, which is perhaps more natural. The availability of orthogonality in inner-product spaces allows this extension to be made in an almost normal fashion. As we shall see below, the additional geometric restriction of Eq. (13) is not essential to the solution process and, in fact, only results in a less canonical form of the inverse.

Begin Tutorial 4: Topological Spaces

This Tutorial is meant to familiarize the reader with the basic principles of a topological space. A topological space $(X, \mathcal{U})$ is a set X with a class¹² $\mathcal{U}$ of distinguished subsets, called open sets of X, that satisfy

(T1) The empty set ∅ and the whole X belong to $\mathcal{U}$.
(T2) Finite intersections of members of $\mathcal{U}$ belong to $\mathcal{U}$.
(T3) Arbitrary unions of members of $\mathcal{U}$ belong to $\mathcal{U}$.

Example 2.4

(1) The smallest topology possible on a set X is its indiscrete topology when the only open sets are ∅ and X; the largest is the discrete topology where every subset of X is open (and hence also closed).

(2) In a metric space $(X, d)$, let $B_\varepsilon(x, d) = \{y \in X: d(x, y) < \varepsilon\}$ be an open ball at x. Any subset U of X such that for each $x \in U$ there is a d-ball $B_\varepsilon(x, d) \subseteq U$ in U, is said to be an open set of $(X, d)$. The collection of all these sets is the topology induced by d. The topological space $(X, \mathcal{U})$ is then said to be associated with (induced by) $(X, d)$.

(3) If ∼ is an equivalence relation on a set X, the set of all saturated sets $[x]_\sim = \{y \in X: y \sim x\}$ is a topology on X; this topology is called the topology of saturated sets. We argue in Sec. 4.2 that this constitutes the defining topology of a chaotic system.

(4) For any subset A of the set X, the A-inclusion topology on X consists of ∅ and every superset of A, while the A-exclusion topology on X consists of all subsets of $X - A$. Thus A is open in the inclusion topology and closed in the exclusion, and in general every open set of one is closed in the other. The special cases of the a-inclusion and a-exclusion topologies for $A = \{a\}$ are defined in a similar fashion.

(5) The cofinite and cocountable topologies in which the open sets of an infinite (resp. uncountable) set X are respectively the complements of finite and countable subsets, are examples of topologies with some unusual properties that are covered in Appendix A.1. If X is itself finite (respectively, countable), then its cofinite (respectively, cocountable) topology is the discrete topology consisting of all its subsets. It is therefore useful to adopt the convention, unless stated to the contrary, that cofinite and cocountable spaces are respectively infinite and uncountable.

In the space $(X, \mathcal{U})$, a neighborhood of a point $x \in X$ is a nonempty subset N of X that contains an open set U containing x; thus $N \subseteq X$ is a

12
     In this sense, a class is a set of sets.

neighborhood of x iff
\[ x \in U \subseteq N \tag{15} \]
for some U ∈ U. The largest open set that can be used here is Int(N) (where, by definition, Int(A) is the largest open set that is contained in A) so that the above neighborhood criterion for a subset N of X can be expressed in the equivalent form
\[ N \subseteq X \text{ is a } \mathcal{U}\text{-neighborhood of } x \text{ iff } x \in \mathrm{Int}_{\mathcal{U}}(N) \tag{16} \]
implying that a subset of (X, U) is a neighborhood of all its interior points, so that N ∈ Nx ⇒ N ∈ Ny for all y ∈ Int(N). The collection of all neighborhoods of x
\[ \mathcal{N}_x \overset{\mathrm{def}}{=} \{N \subseteq X : x \in U \subseteq N \text{ for some } U \in \mathcal{U}\} \tag{17} \]
is the neighborhood system at x, and the subcollection U of the topology used in this equation constitutes a neighborhood (local) base or basic neighborhood system, at x, see Definition A.1.1 of Appendix A.1. The properties

(N1) x belongs to every member N of Nx,
(N2) The intersection of any two neighborhoods of x is another neighborhood of x: N, M ∈ Nx ⇒ N ∩ M ∈ Nx,
(N3) Every superset of any neighborhood of x is a neighborhood of x: (M ∈ Nx) ∧ (M ⊆ N) ⇒ N ∈ Nx,

that characterize Nx completely are a direct consequence of the definitions (15), (16) that may also be stated as

(N0) Any neighborhood N ∈ Nx contains another neighborhood U of x that is a neighborhood of each of its points: ((∀N ∈ Nx)(∃U ∈ Nx)(U ⊆ N)) : (∀y ∈ U ⇒ U ∈ Ny).

    Property (N0) in fact serves as the defining characteristic of an open set, and U can be identified with the largest open set Int(N) contained in N; hence a set G in a topological space is open iff it is a neighborhood of each of its points. Accordingly if Nx is a given class of subsets of X associated with each x ∈ X satisfying (N1)–(N3), then (N0) defines the special class of neighborhoods G
\[ \mathcal{U} = \{G \in \mathcal{N}_x : x \in B \subseteq G \text{ for all } x \in G \text{ and some basic nbd } B \in \mathcal{N}_x\} \tag{18} \]
as the unique topology on X that contains a basic neighborhood of each of its points, for which the neighborhood system at x coincides exactly with the assigned collection Nx; compare with Definition A.1.1. Neighborhoods in topological spaces are a generalization of the familiar notion of distances of metric spaces that quantifies "closeness" of points of X.
    A neighborhood of a non-empty subset A of X that will be needed later on is defined in a similar manner: N is a neighborhood of A iff A ⊆ Int(N), that is A ⊆ U ⊆ N; thus the neighborhood system at A, given by
\[ \mathcal{N}_A = \bigcap_{a \in A} \mathcal{N}_a := \{G \subseteq X : G \in \mathcal{N}_a \text{ for every } a \in A\}, \]
is the class of common neighborhoods of each point of A.
    Some examples of neighborhood systems at a point x in X are the following:

(1) In an indiscrete space (X, U), X is the only neighborhood of every point of the space; in a discrete space any set containing x is a neighborhood of the point.
(2) In an infinite cofinite (or uncountable cocountable) space, every neighborhood of a point is an open neighborhood of that point.
(3) In the topology of saturated sets under the equivalence relation ∼, the neighborhood system at x consists of all supersets of the equivalence class [x]∼.
(4) Let x ∈ X. In the x-inclusion topology, Nx consists of all the non-empty open sets of X which are the supersets of {x}. For a point y ≠ x of X, Ny are the supersets of {x, y}.

    For any given class T S of subsets of X, a unique topology U(T S) can always be constructed on X by taking all finite intersections T S∧ of members of T S followed by arbitrary unions T S∧∨ of these finite intersections. U(T S) := T S∧∨ is the smallest topology on X that contains T S and is said to be generated by T S. For a given topology U on X satisfying U = U(T S), T S is a subbasis, and T S∧ := T B a basis, for the topology U; for more on topological basis, see Appendix A.1. The topology generated by a subbase essentially builds not from the collection T S itself but from the finite intersections T S∧ of its subsets; in comparison the base generates a topology directly from a collection T S of subsets by forming their unions. Thus whereas any class of subsets can be used as a subbasis, a given collection must meet certain qualifications to pass the test of a base for a topology: these and related topics are covered in Appendix A.1. Different subbases, therefore, can be used to generate different topologies on the same set X as the following examples for the case of

X = R demonstrate; here (a, b), [a, b), (a, b] and [a, b], for a ≤ b ∈ R, are the usual open-closed intervals in R (see footnote 13). The subbases T S1 = {(a, ∞), (−∞, b)}, T S2 = {[a, ∞), (−∞, b)}, T S3 = {(a, ∞), (−∞, b]} and T S4 = {[a, ∞), (−∞, b]} give the respective bases T B1 = {(a, b)}, T B2 = {[a, b)}, T B3 = {(a, b]} and T B4 = {[a, b]}, a ≤ b ∈ R, leading to the standard (usual), lower limit (Sorgenfrey), upper limit, and discrete (take a = b) topologies on R. Bases of the type (a, ∞) and (−∞, b) provide the right and left ray topologies on R.

       This feasibility of generating different topologies on a set can be of great practical significance because open sets determine convergence characteristics of nets and continuity characteristics of functions, thereby making it possible for nature to play around with the structure of its working space in its kitchen to its best possible advantage (see footnote 14).

Here are a few essential concepts and terminology for topological spaces.

Definition 2.2 (Boundary, Closure, Interior).   The boundary of A in X is the set of points x ∈ X such that every neighborhood N of x intersects both A and X − A:
\[ \mathrm{Bdy}(A) \overset{\mathrm{def}}{=} \{x \in X : (\forall N \in \mathcal{N}_x)((N \cap A \neq \emptyset) \wedge (N \cap (X - A) \neq \emptyset))\} \tag{19} \]
where Nx is the neighborhood system of Eq. (17) at x.
    The closure of A is the set of all points x ∈ X such that each neighborhood of x contains at least one point of A that may be x itself. Thus the set
\[ \mathrm{Cl}(A) \overset{\mathrm{def}}{=} \{x \in X : (\forall N \in \mathcal{N}_x)(N \cap A \neq \emptyset)\} \tag{20} \]
of all points in X adherent to A is given by the union of A with its boundary.
    The interior of A
\[ \mathrm{Int}(A) \overset{\mathrm{def}}{=} \{x \in X : (\exists N \in \mathcal{N}_x)(N \subseteq A)\} \tag{21} \]
consisting of those points of X that are in A but not in its boundary, Int(A) = A − Bdy(A), is the largest open subset of X that is contained in A. Hence it follows that Int(Bdy(A)) = ∅, the boundary of A is the intersection of the closures of A and X − A, and a subset N of X is a neighborhood of x iff x ∈ Int(N).

    The three subsets Int(A), Bdy(A) and exterior of A defined as Ext(A) := Int(X − A) = X − Cl(A), are pairwise disjoint and have the full space X as their union.

Definition 2.3 (Derived and Isolated sets).   Let A be a subset of X. A point x ∈ X (which may or may not be a point of A) is a cluster point of A if every neighborhood N ∈ Nx contains at least one point of A different from x. The derived set of A
\[ \mathrm{Der}(A) \overset{\mathrm{def}}{=} \{x \in X : (\forall N \in \mathcal{N}_x)(N \cap (A - \{x\}) \neq \emptyset)\} \tag{22} \]
is the set of all cluster points of A. The complement of Der(A) in A
\[ \mathrm{Iso}(A) \overset{\mathrm{def}}{=} A - \mathrm{Der}(A) = \mathrm{Cl}(A) - \mathrm{Der}(A) \tag{23} \]
are the isolated points of A to which no proper sequence in A converges, that is there exists a neighborhood of any such point that contains no other point of A so that the only sequence that converges to a ∈ Iso(A) is the constant sequence (a, a, a, . . .).
    Clearly,
\[ \mathrm{Cl}(A) = A \cup \mathrm{Der}(A) = A \cup \mathrm{Bdy}(A) = \mathrm{Iso}(A) \cup \mathrm{Der}(A) = \mathrm{Int}(A) \cup \mathrm{Bdy}(A) \]
with the last two being disjoint unions, and A is closed iff A contains all its cluster points, Der(A) ⊆ A, iff A contains its closure. Hence
\[ A = \mathrm{Cl}(A) \Leftrightarrow \mathrm{Cl}(A) = \{x \in A : ((\exists N \in \mathcal{N}_x)(N \subseteq A)) \vee ((\forall N \in \mathcal{N}_x)(N \cap (X - A) \neq \emptyset))\}. \]

13
     By definition, an interval I in a totally ordered set (X, ≼) is a subset of X with the property
                                         (x1, x2 ∈ I) ∧ (x3 ∈ X : x1 ≼ x3 ≼ x2) ⇒ x3 ∈ I
so that any element of X lying between two elements of I also belongs to I.
14
   Although we do not pursue this point of view here, it is nonetheless tempting to speculate that the answer to the question
“Why does the entropy of an isolated system increase?” may be found by exploiting this line of reasoning that seeks to explain
the increase in terms of a visible component associated with the usual topology as against a different latent workplace topology
that governs the dynamics of nature.
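
The constructions of this Tutorial can be checked by hand on a finite set. The following is a minimal Python sketch, not part of the original paper: it generates the topology U(T S) from a subbasis by finite intersections followed by unions, and then evaluates Int, Cl, Bdy, Der and Iso directly from Eqs. (19)-(23). The function names, the four-point set X, the subbasis {{1, 2}, {2, 3}} and the subset A = {2, 4} are all illustrative assumptions. Since every neighborhood contains an open neighborhood, it is enough to quantify over the open sets containing a point.

```python
def generate_topology(X, subbasis):
    """Smallest topology on the finite set X containing every subbasis member."""
    X = frozenset(X)
    # Step 1: all finite intersections of subbasis members (empty intersection = X).
    base = {X}
    for S in subbasis:
        S = frozenset(S)
        base |= {B & S for B in base}
    # Step 2: all unions of base members (empty union = empty set).
    opens = {frozenset()}
    for B in base:
        opens |= {U | B for U in opens}
    return opens


def interior(A, opens):
    """Union of all open sets contained in A, cf. Eq. (24)."""
    A = frozenset(A)
    return frozenset().union(*(U for U in opens if U <= A))


def closure(A, X, opens):
    """Cl(A) = X - Int(X - A), a rearrangement of property (c)."""
    X = frozenset(X)
    return X - interior(X - frozenset(A), opens)


def boundary(A, X, opens):
    """Bdy(A) = Cl(A) intersected with Cl(X - A), property (b)."""
    X = frozenset(X)
    return closure(A, X, opens) & closure(X - frozenset(A), X, opens)


def derived(A, X, opens):
    """Cluster points of A, Eq. (22), tested on open neighborhoods only."""
    A = frozenset(A)
    return frozenset(
        x for x in X if all(U & (A - {x}) for U in opens if x in U)
    )


def isolated(A, X, opens):
    """Iso(A) = A - Der(A), Eq. (23)."""
    return frozenset(A) - derived(A, X, opens)


# Hypothetical example: a four-point space with an assumed subbasis.
X = frozenset({1, 2, 3, 4})
opens = generate_topology(X, [{1, 2}, {2, 3}])
A = frozenset({2, 4})

int_A = interior(A, opens)
cl_A = closure(A, X, opens)
bdy_A = boundary(A, X, opens)
der_A = derived(A, X, opens)
iso_A = isolated(A, X, opens)

# The chain of identities stated for Cl(A) in the text.
assert cl_A == A | der_A == A | bdy_A == iso_A | der_A == int_A | bdy_A
```

In this example the generated topology has the six open sets ∅, {2}, {1, 2}, {2, 3}, {1, 2, 3} and X, and for A = {2, 4} the sketch gives Int(A) = {2}, Bdy(A) = Der(A) = {1, 3, 4}, Iso(A) = {2} and Cl(A) = X. Working with open neighborhoods rather than full neighborhood systems is a convenience of the sketch and does not change any of the sets computed.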

Comparison of Eqs. (19) and (22) also makes it clear that Bdy(A) ⊆ Der(A). The special case of A = Iso(A) with Der(A) ⊆ X − A is important enough to deserve a special mention:

Definition 2.4 (Donor set).      A proper, nonempty subset A of X such that Iso(A) = A with Der(A) ⊆ X − A will be called self-isolated or donor. Thus sequences eventually in a donor set converge only in its complement; this is the opposite of the characteristic of a closed set where all converging sequences eventually in the set must necessarily converge in it. A closed-donor set with a closed neighbor has no derived or boundary sets, and will be said to be isolated in X.

Example 2.5. In an isolated set sequences converge, if they have to, simultaneously in the complement (because it is donor) and in it (because it is closed). Convergent sequences in such a set can only be constant sequences. Physically, if we consider adherents to be contributions made by the dynamics of the corresponding sequences, then an isolated set is secluded from its neighbor in the sense that it neither receives any contributions from its surroundings, nor does it give away any. In this light and terminology, a closed set is a selfish set (recall that a set A is closed in X iff every convergent net of X that is eventually in A converges in A; conversely a set is open in X iff the only nets that converge in A are eventually in it), whereas a set with a derived set that intersects itself and its complement may be considered to be neutral. Appendix A.3 shows the various possibilities for the derived set and boundary of a subset A of X.

      Some useful properties of these concepts for a subset A of a topological space X are the following.

(a) Bdy_X(X) = ∅,
(b) Bdy(A) = Cl(A) ∩ Cl(X − A),
(c) Int(A) = X − Cl(X − A) = A − Bdy(A) = Cl(A) − Bdy(A),
(d) Int(A) ∩ Bdy(A) = ∅,
(e) X = Int(A) ∪ Bdy(A) ∪ Int(X − A),
(f) \[ \mathrm{Int}(A) = \bigcup \{G \subseteq X : G \text{ is an open set of } X \text{ contained in } A\} \tag{24} \]
(g) \[ \mathrm{Cl}(A) = \bigcap \{F \subseteq X : F \text{ is a closed set of } X \text{ containing } A\} \tag{25} \]

     A straightforward consequence of property (b) is that the boundary of any subset A of a topological space X is closed in X; this significant result may also be demonstrated as follows. If x ∈ X is not in the boundary of A there is some neighborhood N of x that does not intersect both A and X − A. For each point y ∈ N, N is a neighborhood of that point that does not meet A and X − A simultaneously so that N is contained wholly in X − Bdy(A). We may now take N to be open without any loss of generality implying thereby that X − Bdy(A) is an open set of X from which it follows that Bdy(A) is closed in X.
     Further material on topological spaces relevant to our work can be found in Appendix A.3.

End Tutorial 4

Working in a general topological space, we now recall the solution of an ill-posed problem f(x) = y [Sengupta, 1997] that leads to a multifunctional inverse f− through the generalized inverse G. Let f : (X, U) → (Y, V) be a (nonlinear) function between two topological spaces (X, U) and (Y, V) that is neither one-one nor onto. Since f is not one-one, X can be partitioned into disjoint equivalence classes with respect to the equivalence relation x1 ∼ x2 ⇔ f(x1) = f(x2). Picking a representative member from each of the classes (this is possible by the Axiom of Choice; see the following Tutorial) produces a basic set XB of X; it is basic as it corresponds to the row space in the linear matrix example which is all that is needed for taking an inverse. XB is the counterpart of the quotient set X/∼ of Sec. 1, with the important difference that whereas the points of the quotient set are the equivalence classes of X, XB is a subset of X with each of the classes contributing a point to XB. It then follows that fB : XB → f(X) is the bijective restriction a|row(A) that reduces the original ill-posed problem to a well-posed one with XB and f(X) corresponding respectively to the row and column spaces of A, and fB^{−1} : f(X) → XB is the basic inverse from which the multiinverse f− is obtained through G, which in turn corresponds to the Moore–Penrose inverse GMP. The topological considerations (obviously not for inner product spaces

that applies to the Moore–Penrose inverse) needed to complete the solution are discussed below and in Appendix A.1.


Begin Tutorial 5: Axiom of Choice and Zorn's Lemma

Since some of our basic arguments depend on it, this Tutorial contains a short description of the Axiom of Choice that has been described as "one of the most important, and at the same time one of the most controversial, principles of mathematics". What this axiom states is this: For any set X there exists a function fC : P0(X) → X such that fC(Aα) ∈ Aα for every non-empty subset Aα of X; here P0(X) is the class of all subsets of X except ∅. Thus, if X = {x1, x2, x3} is a three element set, a possible choice function is given by
\[ f_C(\{x_1, x_2, x_3\}) = x_3, \quad f_C(\{x_1, x_2\}) = x_1, \quad f_C(\{x_2, x_3\}) = x_3, \quad f_C(\{x_3, x_1\}) = x_3, \]
\[ f_C(\{x_1\}) = x_1, \quad f_C(\{x_2\}) = x_2, \quad f_C(\{x_3\}) = x_3. \]
It must be appreciated that the axiom is only an existence result that asserts every set to have a choice function, even when nobody knows how to construct one in a specific case. Thus, for example, how does one pick out the isolated irrationals √2 or π from the uncountable reals? There is no doubt that they do exist, for we can construct a right-angled triangle with sides of length 1 or a circle of radius 1. The axiom tells us that these choices are possible even though we do not know how exactly to do it; all that can be stated with confidence is that we can actually pick up rationals arbitrarily close to these irrationals.
     The axiom of choice is essentially meaningful when X is infinite as illustrated in the last two examples. This is so because even when X is denumerable, it would be physically impossible to make an infinite number of selections either all at a time or sequentially: the Axiom of Choice nevertheless tells us that this is possible. The real strength and utility of the Axiom however is when X and some or all of its subsets are uncountable as in the case of the choice of the single element π from the reals. To see this more closely in the context of maps that we are concerned with, let f : X → Y be a non-injective, onto map. To construct a functional right inverse fr : Y → X of f, we must choose, for each y ∈ Y, one representative element xrep from the set f−(y) and define fr(y) to be that element according to f ◦ fr(y) = f(xrep) = y. If there is no preferred or natural way to make this choice, the axiom of choice allows us to make an arbitrary selection from the infinitely many that may be possible from f−(y). When a natural choice is indeed available, as for example in the case of the initial value problem y′(x) = x; y(0) = α0 on [0, a], the definite solution α0 + x²/2 may be selected from the infinitely many
\[ \int_0^x x\,dx = \alpha + \frac{x^2}{2}, \qquad 0 \le x \le a, \]
that are permissible, and the axiom of choice sanctions this selection. In addition, each y ∈ Y gives rise to the family of solution sets Ay = {f−(y) : y ∈ Y} and the real power of the axiom is its assertion that it is possible to make a choice fC(Ay) ∈ Ay on every Ay simultaneously; this permits the choice on every Ay of the collection to be made at the same time.

Pause Tutorial 5

Figure 5 shows our formulation and solution [Sengupta, 1997] of the inverse ill-posed problem f(x) = y. In sub-diagram X − XB − f(X), the surjection p : X → XB is the counterpart of the quotient map Q of Fig. 2 that is known in the present context as the identification of X with XB (as it identifies each saturated subset of X with its representative point in XB), with the space (XB, FT{U; p}) carrying the identification topology FT{U; p} being known as an identification space. By sub-diagram Y − XB − f(X), the image f(X) of f gets the subspace topology (see footnote 15) IT{j; V} from (Y, V) by the inclusion j : f(X) → Y when its open sets are generated as, and only as, j^{−1}(V) = V ∩ f(X) for V ∈ V. Furthermore if the bijection fB connecting XB and f(X) (which therefore acts as a 1 : 1 correspondence between their points, implying that these sets are set-theoretically identical

15
   In a subspace A of X, a subset UA of A is open iff UA = A ∩ U for some open set U of X. The notion of subspace topology
can be formalized with the help of the inclusion map i : A → (X, U) that puts every point of A back to where it came from,
thus
                                                 UA = {UA = A ∩ U : U ∈ U}
                                                    = {i−(U) : U ∈ U}.

except for their names) is image continuous, then by Theorem A.2.1 of Appendix A.2, so is the association q = fB ◦ p : X → f(X) that associates saturated sets of X with elements of f(X); this makes f(X) look like an identification space of X by assigning to it the topology FT{U; q}. On the other hand if fB happens to be preimage continuous, then XB acquires, by Theorem A.2.2, the initial topology IT{e; V} by the embedding e : XB → Y that embeds XB into Y through j ◦ fB, making it look like a subspace of Y (see footnote 16). In this dual situation, fB has the highly interesting topological property of being simultaneously image and preimage continuous when the open sets of XB and f(X) (which are simply the fB^{−1}-images of the open sets of f(X) which, in turn, are the fB-images of these saturated open sets) can be considered to have been generated by fB, and are respectively the smallest and largest collection of subsets of X and Y that makes fB ini(tial-fi)nal continuous [Sengupta, 1997]. A bijective ininal function such as fB is known as a homeomorphism and ininality for functions that are neither 1 : 1 nor onto is a generalization of homeomorphism for bijections; refer to Eqs. (A.47) and (A.48) for a set-theoretic formulation of this distinction. A homeomorphism f : (X, U) → (Y, V) renders the homeomorphic spaces (X, U) and (Y, V) topologically indistinguishable which may be considered to be identical in as far as their topological properties are concerned.

[Figure 5: commutative diagrams; the graphic is not recoverable from the extracted text.]

Fig. 5. Solution of ill-posed problem f(x) = y, f : X → Y. G : Y → XB, a generalized inverse of f because of f Gf = f and Gf G = G which follows from the commutativity of the diagrams, is a functional selection of the multi-inverse f− : (Y, V) –→→ (X, U). f and f are the injective and surjective restrictions of f; these will be topologically denoted by their generic notations e and q, respectively.

Remark.    It may be of some interest here to speculate on the significance of ininality in our work. Physically, a map f : (X, U) → (Y, V) between two spaces can be taken to represent an interaction between them and the algebraic and topological characters of f determine the nature of this interaction. A simple bijection merely sets up a correspondence, that is an interaction, between every member of X with some member of Y, whereas a continuous map establishes the correspondence among the special category of "open" sets. Open sets, as we see in Appendix A.1, are the basic ingredients in the theory of convergence of sequences, nets and filters, and the characterization of open sets in terms of convergence, namely that a set G in X is open in it if every net or sequence that converges in X to a point in G is eventually in G, see Appendix A.1, may be interpreted to mean that such sets represent groupings of elements that require membership of the group before permitting an element to belong to it; an open set unlike its complement the closed or selfish set, however, does not forbid a net that has been eventually in it to settle down in its selfish neighbor, who nonetheless will never allow such a situation to develop in its own territory. An ininal map forces these well-defined and definite groups in (X, U) and (Y, V) to interact with each other through f; this is not possible with simple continuity as there may be open sets in X that are not derived from those of Y and non-open sets in Y whose inverse images are open in X. It is our hypothesis that the driving force behind the evolution of a system represented by the input–output relation f(x) = y is the attainment of the ininal triple state (X, f, Y) for the system. A preliminary analysis of this hypothesis is to be found in Sec. 4.2.
     For ininality of the interaction, it is therefore necessary to have
                  FT{U; f} = IT{j; V}
                  IT{f; V} = FT{U; p}                                       (26)
(the f of the first equation being the surjective restriction of f, and that of the second its injective restriction); in what follows we will refer to the injective and surjective restrictions of f by their generic topological symbols of embedding e and association q, respectively. What are the topological characteristics of f

16
  A surjective function is an association iff it is image continuous and an injective function is an embedding iff it is preimage
continuous.

in order that the requirements of Eq. (26) be met?
                                                                                               £




From Appendix A.1, it should be clear by super-
posing the two parts of Fig. 21 over each other that                               ¡




given q : (X, U) → (f (X), FT{U; q}) in the first of
                                                                               ¤




these equations, IT{j; V} will equal FT{U; q} iff j                                         ¥




is an ininal open inclusion and Y receives FT{U; f }.                                  ¦




In a similar manner, preimage continuity of e requires p to be open ininal and f to be preimage continuous if the second of Eq. (26) is to be satisfied. Thus under the restrictions imposed by Eq. (26), the interaction f between X and Y must be such as to give X the smallest possible topology of f-saturated sets and Y the largest possible topology of images of all these sets: f, under these conditions, is an ininal transformation. Observe that a direct application of parts (b) of Theorems A.2.1 and A.2.2 to Fig. implies that Eq. (26) is satisfied iff $f_B$ is ininal, that is iff it is a homeomorphism. Ininality of f is simply a reflection of this as it is neither 1 : 1 nor onto.

      The f- and p-images of each saturated set of X are singletons in Y (these saturated sets in X arose, in the first place, as $f^-(\{y\})$ for $y \in Y$) and in $X_B$, respectively. This permits the embedding $e = j \circ f_B$ to give $X_B$ the character of a virtual subspace of Y just as i makes f(X) a real subspace. Hence the inverse images $p^-(x_r) = f^-(e(x_r))$ with $x_r \in X_B$, and $q^-(y) = f^-(i(y))$ with $y = f_B(x_r) \in f(X)$ are the same, and are just the corresponding $f^-$ images via the injections e and i, respectively. G, a left inverse of e, is a generalized inverse of f. G is a generalized inverse because the two set-theoretic defining requirements of fGf = f and GfG = G for the generalized inverse are satisfied, as Fig. shows, in the following forms

\[
j f_B G f = f, \qquad G j f_B G = G.
\]

In fact the commutativity embodied in these equalities is self evident from the fact that $e = i f_B$ is a left inverse of G, that is $eG = 1_Y$. Putting $X_B$ back into X by identifying each point of $X_B$ with the set it came from yields the required set-valued inverse $f^-$, and G may be viewed as a functional selection of the multiinverse $f^-$.

      An injective branch of a function f in this work refers to the restriction $f_B$ and its associated inverse $f_B^{-1}$.

      The following example of an inverse ill-posed problem will be useful in fixing the notations introduced above. Let f on [0, 1] be the function shown below.

Fig. 6. The function
\[
f(x) = \begin{cases}
2x, & 0 \le x < 3/8 \\
3/4, & 3/8 \le x \le 5/8 \\
7/6 - 2x/3, & 5/8 < x \le 1.
\end{cases}
\]

      Then f(x) = y is well-posed for [0, 1/4), and ill-posed in [1/4, 1]. There are two injective branches of f in $[1/4, 3/8) \cup (5/8, 1]$, and f is constant ill-posed in [3/8, 5/8]. Hence the basic component $f_B$ of f can be taken to be $f_B(x) = 2x$ for $x \in [0, 3/8)$ having the inverse $f_B^{-1}(y) = y/2$ with $y \in [0, 3/4]$. The generalized inverse is obtained by taking [0, 3/4] as a subspace of [0, 1], while the multiinverse $f^-$ follows by associating with every point of the basic domain $[0, 1]_B = [0, 3/8]$, the respective equivalent points $[3/8]_f = [3/8, 5/8]$ and $[x]_f = \{x, 7/4 - 3x\}$ for $x \in [1/4, 3/8)$. Thus the inverses G and $f^-$ of f are$^{17}$

\[
G(y) = \begin{cases}
y/2, & y \in [0, 3/4] \\
0, & y \in (3/4, 1]
\end{cases},
\]
17
  If $y \notin R(f)$ then $f^-(\{y\}) := \emptyset$, which is true for any subset of $Y - R(f)$. However from the set-theoretic definition of natural numbers that requires $0 := \emptyset$, $1 = \{0\}$, $2 = \{0, 1\}$ to be defined recursively, it follows that $f^-(y)$ can be identified with 0 whenever y is not in the domain of $f^-$. Formally, the successor set $A^+ = A \cup \{A\}$ of A can be used to write $0 := \emptyset$, $1 = 0^+ = 0 \cup \{0\}$, $2 = 1^+ = 1 \cup \{1\} = \{0\} \cup \{1\}$, $3 = 2^+ = 2 \cup \{2\} = \{0\} \cup \{1\} \cup \{2\}$, etc. Then the set of natural numbers N is defined to be the intersection of all the successor sets, where a successor set S is any set that contains $\emptyset$ and $A^+$ whenever A belongs to S. Observe how in the successor notation, a countable union of singleton integers recursively defines the corresponding sum of integers.
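The generalized-inverse identities fGf = f and GfG = G can be checked directly on the function of Fig. 6. The following is a minimal sketch, not part of the original paper: the names `f`, `G` and `grid` are illustrative, exact rational arithmetic is used to avoid rounding, and only a finite sample of [0, 1] is tested, so it is a sanity check rather than a proof.

```python
from fractions import Fraction as Fr

def f(x):
    """Piecewise function of Fig. 6 on [0, 1]."""
    if x < Fr(3, 8):
        return 2 * x
    if x <= Fr(5, 8):
        return Fr(3, 4)
    return Fr(7, 6) - Fr(2, 3) * x

def G(y):
    """Generalized inverse: y/2 on R(f) = [0, 3/4], and 0 on (3/4, 1]."""
    return y / 2 if y <= Fr(3, 4) else Fr(0)

# Exact rational sample of [0, 1] that hits all three pieces of f.
grid = [Fr(k, 48) for k in range(49)]

# The two set-theoretic requirements for a generalized inverse.
fGf_ok = all(f(G(f(x))) == f(x) for x in grid)
GfG_ok = all(G(f(G(y))) == G(y) for y in grid)

# Equivalent points [x]_f = {x, 7/4 - 3x} on [1/4, 3/8) share one image.
branch_ok = all(f(x) == f(Fr(7, 4) - 3 * x)
                for x in grid if Fr(1, 4) <= x < Fr(3, 8))

print(fGf_ok, GfG_ok, branch_ok)
```

G is not a right inverse here: for y in (3/4, 1], f(G(y)) = f(0) = 0, which differs from y.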

\[
f^-(y) = \begin{cases}
y/2, & y \in [0, 1/2) \\
\{y/2,\; 7/4 - 3y/2\}, & y \in [1/2, 3/4) \\
[3/8, 5/8], & y = 3/4 \\
0, & y \in (3/4, 1],
\end{cases}
\]

which shows that $f^-$ is multivalued. In order to avoid cumbersome notations, an injective branch of f will always refer to a representative basic branch $f_B$, and its "inverse" will mean either $f_B^{-1}$ or G.

Example 2.3 (Revisited). The row reduced echelon form of the augmented matrix (A|b) of Example 2.3 is

\[
(A|b) \to \left(\begin{array}{rrrrr|c}
1 & -3 & 0 & \frac{3}{2} & \frac{1}{2} & \frac{5b_1}{2} - \frac{b_2}{2} \\
0 & 0 & 1 & -\frac{1}{4} & \frac{3}{4} & -\frac{3b_1}{4} + \frac{b_2}{4} \\
0 & 0 & 0 & 0 & 0 & -2b_1 + b_3 \\
0 & 0 & 0 & 0 & 0 & b_1 - b_2 + b_4
\end{array}\right). \tag{27}
\]

The multifunctional solution $x = A^- b$, with b any element of $Y = \mathbb{R}^4$ not necessarily in the image of a, is

\[
x = A^- b = Gb + x_2 \begin{pmatrix} 3 \\ 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}
+ x_4 \begin{pmatrix} -\frac{3}{2} \\ 0 \\ \frac{1}{4} \\ 1 \\ 0 \end{pmatrix}
+ x_5 \begin{pmatrix} -\frac{1}{2} \\ 0 \\ -\frac{3}{4} \\ 0 \\ 1 \end{pmatrix},
\]

with its multifunctional character arising from the arbitrariness of the coefficients $x_2$, $x_4$ and $x_5$. The generalized inverse

\[
G = \begin{pmatrix}
\frac{5}{2} & -\frac{1}{2} & 0 & 0 \\
0 & 0 & 0 & 0 \\
-\frac{3}{4} & \frac{1}{4} & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{pmatrix} : Y \to X_B \tag{28}
\]

is the unique matrix representation of the functional inverse $a_B^{-1} : a(\mathbb{R}^5) \to X_B$ extended to Y defined according to$^{18}$

\[
g(b) = \begin{cases}
a_B^{-1}(b), & \text{if } b \in R(a) \\
0, & \text{if } b \in Y - R(a),
\end{cases} \tag{29}
\]

that bears comparison with the basic inverse

\[
A_B^{-1}(b^*) = \begin{pmatrix}
\frac{5}{2} & -\frac{1}{2} & 0 & 0 \\
0 & 0 & 0 & 0 \\
-\frac{3}{4} & \frac{1}{4} & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{pmatrix}
\begin{pmatrix} b_1 \\ b_2 \\ 2b_1 \\ b_2 - b_1 \end{pmatrix} : a(\mathbb{R}^5) \to X_B
\]

between the two-dimensional column and row spaces of A which is responsible for the particular solution of Ax = b. Thus G is simply $A_B^{-1}$ acting on its domain a(X) considered a subspace of Y, suitably extended to the whole of Y. That it is indeed a generalized inverse is readily seen through the matrix multiplications GAG and AGA that can be verified to reproduce G and A, respectively. Comparison of Eqs. (12) and (29) shows that the Moore–Penrose inverse differs from ours through the geometrical constraints imposed in its definition, Eqs. (13). Of course, this results in a more complex inverse (14) as compared to our very simple (28); nevertheless it is true that both the inverses satisfy

\[
E((E(G_{MP}))^T) = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0
\end{pmatrix} = E((E(G))^T)
\]

where E(A) is the row-reduced echelon form of A. The canonical simplicity of Eq. (28) as compared to Eq. (14) is a general feature that suggests a more natural choice of bases by the map a than the orthogonal set imposed by Moore and Penrose. This is to be expected since the MP inverse, governed by Eq. (13), is a subset of our less restricted inverse

18
     See footnote 17 for a justification of the definition when b is not in R(a).
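The claim in Example 2.3 (Revisited) that AGA and GAG reproduce A and G can be verified in a few lines. This is a sketch under a stated assumption: the matrix `A` below is rebuilt from the two spanning rows quoted in the text together with the dependencies $b_3 = 2b_1$ and $b_4 = b_2 - b_1$ read off the zero rows of Eq. (27), and it is assumed, not confirmed here, that this coincides with the matrix of Example 2.3. The helper `matmul` is illustrative.

```python
from fractions import Fraction as Fr

def matmul(P, Q):
    """Exact product of two matrices given as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)]
            for row in P]

# Assumed 4x5 matrix: rows 1 and 2 are the spanning rows quoted in the text,
# row 3 = 2*row 1 and row 4 = row 2 - row 1 (zero rows of Eq. (27)).
r1 = [1, -3, 2, 1, 2]
r2 = [3, -9, 10, 2, 9]
A = [r1, r2, [2 * a for a in r1], [b - a for a, b in zip(r1, r2)]]
A = [[Fr(v) for v in row] for row in A]

# Generalized inverse G of Eq. (28), a 5x4 matrix.
G = [[Fr(5, 2), Fr(-1, 2), 0, 0],
     [0, 0, 0, 0],
     [Fr(-3, 4), Fr(1, 4), 0, 0],
     [0, 0, 0, 0],
     [0, 0, 0, 0]]
G = [[Fr(v) for v in row] for row in G]

AG = matmul(A, G)
AGA_ok = matmul(AG, A) == A   # A G A = A
GAG_ok = matmul(G, AG) == G   # G A G = G

# The three null-space vectors listed in item (d) should satisfy Ax = 0.
null_basis = [[3, 1, 0, 0, 0], [-6, 0, 1, 4, 0], [-2, 0, -3, 0, 4]]
null_ok = all(sum(a * v for a, v in zip(row, n)) == 0
              for n in null_basis for row in A)

print(AGA_ok, GAG_ok, null_ok)
print(AG)
```

With this A, the product AG is not the identity; its first two rows are the first two rows of the identity, and its last two rows repeat the dependencies of $b_3$, $b_4$ on $b_1$, $b_2$.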

described by only the first two of (13); more specifically the difference is made clear in Fig. 4(a) which shows that for any $b \notin R(A)$, only $G_{MP}(b_\perp) = 0$ as compared to $G(b) = 0$. This seems to imply that introducing extraneous topological considerations into the purely set-theoretic inversion process may not be a recommended way of inverting, and the simple bases comprising the row and null spaces of A and $A^T$ (which are mutually orthogonal just as those of the Moore–Penrose) are a better choice for the particular problem Ax = b than the general orthonormal bases that the MP inverse introduces. These "good" bases, with respect to which the generalized inverse G has a considerably simpler representation, are obtained in a straightforward manner from the row-reduced forms of A and $A^T$. These bases are

(a) The column space of A is spanned by the columns $(1, 3, 2, 2)^T$ and $(1, 5, 2, 4)^T$ of A that correspond to the basic columns containing the leading 1's in the row-reduced form of A,
(b) The null space of $A^T$ is spanned by the solutions $(-2, 0, 1, 0)^T$ and $(1, -1, 0, 1)^T$ of the equation $A^T b = 0$,
(c) The row space of A is spanned by the rows (1, −3, 2, 1, 2) and (3, −9, 10, 2, 9) of A corresponding to the non-zero rows in the row-reduced form of A,
(d) The null space of A is spanned by the solutions (3, 1, 0, 0, 0), (−6, 0, 1, 4, 0), and (−2, 0, −3, 0, 4) of the equation Ax = 0.

     The main differences between the natural "good" bases and the MP-bases that are responsible for the difference in the form of inverses are that the latter have the additional restriction of being orthogonal to each other (recall the orthogonality property of the Q-matrices), and the more severe one of basis vectors mapping onto basis vectors according to $Ax_i = \sigma_i b_i$, $i = 1, \ldots, r$, where the $\{x_i\}_{i=1}^n$ and $\{b_j\}_{j=1}^m$ are the eigenvectors of $A^T A$ and $AA^T$ respectively and $(\sigma_i)_{i=1}^r$ are the positive square roots of the non-zero eigenvalues of $A^T A$ (or of $AA^T$), with r denoting the dimension of the row or column space. This is considered as a serious restriction as the linear combination of the basis $\{b_j\}$ that $Ax_i$ should otherwise have been equal to, allows a greater flexibility in the matrix representation of the inverse that shows up in the structure of G. These are, in fact, quite general considerations in the matrix representation of linear operators; thus the basis that diagonalizes an n × n matrix (when this is possible) is not the standard "diagonal" orthonormal basis of $\mathbb{R}^n$, but a problem-dependent, less canonical, basis consisting of the n eigenvectors of the matrix. The 0-rows of the inverse of Eq. (28) result from the three-dimensional null-space variables $x_2$, $x_4$ and $x_5$, while the 0-columns come from the two-dimensional image-space dependency of $b_3$, $b_4$ on $b_1$ and $b_2$, that is from the last two zero rows of the reduced echelon form (27) of the augmented matrix.

     We will return to this theme of the generation of a most appropriate problem-dependent topology for a given space in the more general context of chaos in Sec. 4.2.

     In concluding this introduction to generalized inverses we note that the inverse G of f comes very close to being a right inverse: thus even though $AG \neq 1_2$ its row-reduced form

\[
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{pmatrix}
\]

is to be compared with the corresponding less satisfactory

\[
\begin{pmatrix}
1 & 0 & 2 & -1 \\
0 & 1 & 0 & 1 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{pmatrix}
\]

representation of $AG_{MP}$.

3. Multifunctional Extension of Function Spaces

The previous section has considered the solution of ill-posed problems as multifunctions and has shown how this solution may be constructed. Here we introduce the multifunction space Multi|(X) as the first step toward obtaining a smallest dense extension Multi(X) of the function space Map(X). Multi|(X) is basic to our theory of chaos [Sengupta & Ray, 2000] in the sense that a chaotic state of a system can be fully described by such an indeterminate multifunctional state. In fact, multifunctions also enter in a natural way in describing the spectrum of nonlinear functions that we consider in Sec. 6; this is required to complete the construction of the smallest extension Multi(X) of the function space Map(X). The main tool in obtaining the

space Multi|(X) from Map(X) is a generalization of the technique of pointwise convergence of continuous functions to (discontinuous) functions. In the analysis below, we consider nets instead of sequences as the spaces concerned, like the topology of pointwise convergence, may not be first countable, Appendix A.1.

3.1. Graphical convergence of a net of functions

Let $(X, \mathcal{U})$ and $(Y, \mathcal{V})$ be Hausdorff spaces and $(f_\alpha)_{\alpha \in D} : X \to Y$ be a net of piecewise continuous functions, not necessarily with the same domain or range, and suppose that for each $\alpha \in D$ there is a finite set $I_\alpha = \{1, 2, \ldots, P_\alpha\}$ such that $f_\alpha^-$ has $P_\alpha$ functional branches possibly with different domains; obviously $I_\alpha$ is a singleton iff $f_\alpha$ is injective. For each $\alpha \in D$, define functions $(g_{\alpha i})_{i \in I_\alpha} : Y \to X$ such that

\[
f_\alpha g_{\alpha i} f_\alpha = f_{\alpha i}^I, \qquad i = 1, 2, \ldots, P_\alpha,
\]

where $f_{\alpha i}^I$ is a basic injective branch of $f_\alpha$ on some subset of its domain: $g_{\alpha i} f_{\alpha i}^I = 1_X$ on $D(f_{\alpha i}^I)$, $f_{\alpha i}^I g_{\alpha i} = 1_Y$ on $D(g_{\alpha i})$ for each $i \in I_\alpha$. The use of nets and filters is dictated by the fact that we do not assume X and Y to be first countable. In the application to the theory of dynamical systems that follows, X and Y are compact subsets of $\mathbb{R}$, in which case the use of sequences suffices.

      In terms of the residual and cofinal subsets Res(D) and Cof(D) of a directed set D (Definition A.1.7), with x and y in the equations below being taken to belong to the required domains, define subsets $D_-$ of X and $R_-$ of Y as

\[
D_- = \{x \in X : ((f_\nu(x))_{\nu \in D} \text{ converges in } (Y, \mathcal{V}))\} \tag{30}
\]
\[
R_- = \{y \in Y : (\exists i \in I_\nu)((g_{\nu i}(y))_{\nu \in D} \text{ converges in } (X, \mathcal{U}))\} \tag{31}
\]

Thus,
     $D_-$ is the set of points of X on which the values of a given net of functions $(f_\alpha)_{\alpha \in D}$ converge pointwise in Y. Explicitly, this is the subset of X on which subnets$^{19}$ in Map(X, Y) combine to form a net of functions that converge pointwise to a limit function $F : D_- \to Y$.
     $R_-$ is the set of points of Y on which the values of the nets in X generated by the injective branches of $(f_\alpha)_{\alpha \in D}$ converge pointwise in X. Explicitly, this is the subset of Y on which subnets of injective branches of $(f_\alpha)_{\alpha \in D}$ in Map(Y, X) combine to form a net of functions that converge pointwise to a family of limit functions $G : R_- \to X$. Depending on the nature of $(f_\alpha)_{\alpha \in D}$, there may be more than one $R_-$ with a corresponding family of limit functions on each of them. To simplify the notation, we will usually let $G : R_- \to X$ denote all the limit functions on all the sets $R_-$.

      If we consider cofinal rather than residual subsets of D then the corresponding $D_+$ and $R_+$ can be expressed as

\[
D_+ = \{x \in X : ((f_\nu(x))_{\nu \in \mathrm{Cof}(D)} \text{ converges in } (Y, \mathcal{V}))\} \tag{32}
\]
\[
R_+ = \{y \in Y : (\exists i \in I_\nu)((g_{\nu i}(y))_{\nu \in \mathrm{Cof}(D)} \text{ converges in } (X, \mathcal{U}))\}. \tag{33}
\]

It is to be noted that the conditions $D_+ = D_-$ and $R_+ = R_-$ are necessary and sufficient for the Kuratowski convergence to exist. Since $D_+$ and $R_+$ differ from $D_-$ and $R_-$ only in using cofinal subsets of D in place of residual ones, and since residual sets are also cofinal, it follows that $D_- \subseteq D_+$ and $R_- \subseteq R_+$. The sets $D_-$ and $R_-$ serve for the convergence of a net of functions just as $D_+$ and $R_+$ are for the convergence of subnets of the nets (adherence). The latter sets are needed when subsequences are to be considered as sequences in their own right as, for example, in dynamical systems theory in the case of ω-limit sets.

      As an illustration of these definitions, consider the sequence of injective functions on the interval [0, 1], $f_n(x) = 2^n x$ for $x \in [0, 1/2^n]$, $n = 0, 1, 2, \ldots$. Then $D_{0.2}$ is the set {0, 1, 2} and only $D_0$ is eventual in D. Hence $D_-$ is the single point set {0}. On the other hand $D_y$ is eventual in D for all y and $R_-$ is [0, 1].

Definition 3.1 (Graphical Convergence of a net of functions). A net of functions $(f_\alpha)_{\alpha \in D} : (X, \mathcal{U}) \to (Y, \mathcal{V})$ is said to converge graphically if either $D_- \neq \emptyset$ or $R_- \neq \emptyset$; in this case let $F : D_- \to Y$ and $G : R_- \to X$ be the entire collection of limit functions. Because of the assumed Hausdorffness of X and Y, these limits are well defined.

      The graph of the graphical limit M of the net $(f_\alpha) : (X, \mathcal{U}) \to (Y, \mathcal{V})$, denoted by $f_\alpha \xrightarrow{\mathbf{G}} M$, is the

19
     A subnet is the generalized uncountable equivalent of a subsequence; for the technical definition, see Appendix A.1.
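The illustration with $f_n(x) = 2^n x$ on $[0, 1/2^n]$ can be reproduced numerically. The sketch below is only indicative: the cut-off `N` stands in for the infinite index set and is an assumption of the example, and the names `index_set` and `g` are illustrative. It lists the indices n for which a point x lies in the domain of $f_n$, and evaluates the inverse branches $g_n(y) = y/2^n$.

```python
from fractions import Fraction as Fr

N = 30  # hypothetical cut-off standing in for the infinite index set

def index_set(x, n_max=N):
    """Indices n <= n_max for which x lies in the domain [0, 1/2**n] of f_n."""
    return [n for n in range(n_max + 1) if 0 <= x <= Fr(1, 2**n)]

def g(n, y):
    """Inverse branch g_n(y) = y / 2**n, defined for every y in [0, 1]."""
    return y / 2**n

D_02 = index_set(Fr(1, 5))  # x = 0.2 belongs to only finitely many domains
D_0 = index_set(Fr(0))      # x = 0 belongs to every domain
tail = g(N, Fr(1))          # largest value taken by g_N on [0, 1]

print(D_02, len(D_0), float(tail))
```

Only x = 0 stays in the domains for all n, consistent with $D_- = \{0\}$, while $g_n(y)$ is defined for every n and shrinks toward 0 for each y in [0, 1], consistent with $R_- = [0, 1]$.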

subset of $D_- \times R_-$ that is the union of the graphs of the function F and the multifunction $G^-$

\[
\mathcal{G}_M = \mathcal{G}_F \cup \mathcal{G}_{G^-}
\]

where

\[
\mathcal{G}_{G^-} = \{(x, y) \in X \times Y : (y, x) \in \mathcal{G}_G \subseteq Y \times X\}.
\]

Begin Tutorial 6: Graphical Convergence

The following two examples are basic to the understanding of the graphical convergence of functions to multifunctions and were the examples that motivated our search for an acceptable technique that did not require vertical portions of limit relations to disappear simply because they were non-functions: the disturbing question that needed an answer was how not to mathematically sacrifice these extremely significant physical components of the limiting correspondences. Furthermore, it appears to be quite plausible to expect a physical interaction between two spaces X and Y to be a consequence of both the direct interaction represented by $f : X \to Y$ and also the inverse interaction $f^- : Y \rightrightarrows X$, and our formulation of pointwise biconvergence is a formalization of this idea. Thus the basic examples (1) and (2) below produce multifunctions instead of discontinuous functions that would be obtained by the usual pointwise limit.

Example 3.1

(1)
\[
f_n(x) = \begin{cases}
0, & -1 \le x \le 0 \\
nx, & 0 \le x \le 1/n \\
1, & 1/n \le x \le 1
\end{cases} : [-1, 1] \to [0, 1]
\]
\[
g_n(y) = \frac{y}{n} : [0, 1] \to \left[0, \frac{1}{n}\right]
\]

    Then

\[
F(x) = \begin{cases}
0, & -1 \le x \le 0 \\
1, & 0 < x \le 1
\end{cases} \quad \text{on } D_- = D_+ = [-1, 0] \cup (0, 1]
\]
\[
G(y) = 0 \quad \text{on } R_- = [0, 1] = R_+.
\]

    The graphical limit is $([-1, 0], 0) \cup (0, [0, 1]) \cup ((0, 1], 1)$.

(2) $f_n(x) = nx$ for $x \in [0, 1/n]$ gives $g_n(y) = y/n : [0, 1] \to [0, 1/n]$. Then

\[
F(x) = 0 \quad \text{on } D_- = \{0\} = D_+,
\]
\[
G(y) = 0 \quad \text{on } R_- = [0, 1] = R_+.
\]

    The graphical limit is (0, [0, 1]).

In these examples that we consider to be the prototypes of graphical convergence of functions to multifunctions, G(y) = 0 on $R_-$ because $g_n(y) \to 0$ for all $y \in R_-$. Compare the graphical multifunctional limits with the corresponding usual pointwise functional limits characterized by discontinuity at x = 0. Two more examples from Sengupta and Ray [2000] that illustrate this new convergence principle tailored specifically to capture one-to-many relations are shown in Fig. 7 which also provides an example in Fig. 7(c) of a function whose iterates do not converge graphically because in this case both the sets $D_-$ and $R_-$ are empty. The power of graphical convergence in capturing multifunctional limits is further demonstrated by the example of the sequence $(\sin n\pi x)_{n=1}^{\infty}$ that converges to 0 both 1-integrally and test-functionally, Eqs. (3) and (4).

     It is necessary to understand how the concepts of eventually in and frequently in of Appendix A.2 apply in examples (a) and (b) of Fig. 7. In these two examples we have two subsequences, one for the even indices and the other for the odd. For a point-to-point functional relation, this would mean that the sequence frequents the adherence set adh(x) of the sequence $(x_n)$ but does not converge anywhere as it is not eventually in every neighborhood of any point. For a multifunctional limit however it is possible, as demonstrated by these examples, for the subsequences to be eventually in every neighborhood of certain subsets common to the eventual limiting sets of the subsequences; this intersection of the subsequential limits is now


[Figure 7, panels (a)–(d). Panels (a) and (b): subsequences for n even and n odd. Panel (c): 12 iterates of $-0.05 + x - x^2$. Panel (d): 12 iterates of $0.7 + x - x^2$.]
Fig. 7. The graphical limits are:
(a) $F(x) = \begin{cases} 1 & \text{for } 0 \le x \le 1 \\ 0 & \text{for } 1 < x \le 2 \end{cases}$ on $D_- = [0, 1] \cup (1, 2]$, and $G(y) = 1$ on $R_- = [0, 1]$. Also $G = \begin{cases} 1 & \text{on } R_+ = [0, 3/2] \\ 1 & \text{on } R_+ = [-1/2, 1] \end{cases}$.
(b) $F(x) = 1$ on $D_- = \{0\}$ and $G(y) = 0$ on $R_- = \{1\}$. Also $F(x) = -1/2, 0, 1, 3/2$ respectively on $D_+ = (0, 3], \{2\}, \{0\}, (0, 2)$ and $G(y) = 0, 0, 2, 3$ respectively on $R_+ = (-1/2, 1], [1, 3/2), [0, 3/2), [-1/2, 0)$.
(c) For $f(x) = -0.05 + x - x^2$, no graphical limit as $D_- = \emptyset = R_-$.
(d) For $f(x) = 0.7 + x - x^2$, $F(x) = \alpha$ on $D_- = [a, c]$, $G_1(y) = a$ and $G_2(y) = c$ on $R_- = (-\infty, \alpha]$. Notice how the two fixed points and their equivalent images define the converged limit rectangular multi. As in example (1) one has $D_- = D_+$; also $R_- = R_+$.
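The behavior summarized in panels (c) and (d) of Fig. 7 can be probed with a short numerical experiment. The sketch below follows the orbits of a few sample points only, so it illustrates pointwise behavior rather than the graphical limit itself. It assumes, from the fixed-point equation $x^2 = 0.7$ of the map in (d), that $\alpha = \sqrt{0.7}$, $a = -\sqrt{0.7}$ and $c = 1 - a$; these identifications, the helper names and the escape threshold are illustrative choices, not taken from the paper.

```python
import math

def iterate(f, x, n):
    """Return f^n(x), or -inf once the orbit has clearly escaped downwards."""
    for _ in range(n):
        x = f(x)
        if x < -1e6:          # hypothetical escape threshold
            return -math.inf
    return x

def f_c(x):
    """Map of Fig. 7(c): no real fixed point, since x^2 = -0.05 has no solution."""
    return -0.05 + x - x * x

def f_d(x):
    """Map of Fig. 7(d): fixed points at +sqrt(0.7) and -sqrt(0.7)."""
    return 0.7 + x - x * x

alpha = math.sqrt(0.7)   # assumed stable fixed point, |f_d'(alpha)| < 1
a = -alpha               # assumed unstable fixed point
c = 1.0 - a              # equivalent image of a: f_d(c) = a by symmetry about x = 1/2

# Sample points inside (a, c) settle on alpha; points outside escape.
inside = [iterate(f_d, x, 200) for x in (-0.5, 0.0, 0.5, 1.0, 1.5)]
outside = [iterate(f_d, x, 200) for x in (-1.0, 2.0)]

# Every sampled orbit of the map in (c) escapes, consistent with an empty D_-.
no_limit = [iterate(f_c, x, 200) for x in (-0.5, 0.0, 0.5, 1.0)]

print(inside, outside, no_limit)
```

Under these assumptions the sampled orbits of the map in (d) that start between $a$ and $c$ approach $\alpha$, while those of the map in (c) all run off to $-\infty$.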


defined to be the limit of the original sequence. A similar situation obtains, for example, in the solution of simultaneous equations: The solution of the equation $a_{11} x_1 + a_{12} x_2 = b_1$ for one of the variables $x_2$ say with $a_{12} \neq 0$, is the set represented by the straight line $x_2 = m_1 x_1 + c_1$ for all $x_1$ in its domain, while for a different set of constants $a_{21}$, $a_{22}$ and $b_2$ the solution is the entirely different set $x_2 = m_2 x_1 + c_2$, under the assumption that $m_1 \neq m_2$ and $c_1 \neq c_2$. Thus even though the individual equations (subsequences) of the simultaneous set of equations (sequence) may have distinct solutions (limits), the solution of the equations is their common point of intersection.

    Considered as sets in $X \times Y$, the discussion of convergence of a sequence of graphs $f_n : X \to Y$ would be incomplete without a mention of the convergence of a sequence of sets under the Hausdorff metric that is so basic in the study of fractals. In this case, one talks about the convergence of a sequence of compact subsets of the metric space $\mathbb{R}^n$ so that the sequences, as also the limit points that

are the fractals, are compact subsets of $\mathbb{R}^n$. Let $K$ denote the collection of all nonempty compact subsets of $\mathbb{R}^n$. Then the Hausdorff metric $d_H$ between two sets on $K$ is defined to be

\[ d_H(E, F) = \max\{\delta(E, F), \delta(F, E)\}, \qquad E, F \in K, \]

where

\[ \delta(E, F) = \max_{x \in E} \min_{y \in F} \|x - y\|_2 \]

is the non-symmetric 2-norm in $\mathbb{R}^n$. The power and utility of the Hausdorff distance is best understood in terms of the dilations $E + \varepsilon := \bigcup_{x \in E} D_\varepsilon(x)$ of a subset $E$ of $\mathbb{R}^n$ by $\varepsilon$ where $D_\varepsilon(x)$ is a closed ball of radius $\varepsilon$ at $x$; physically a dilation of $E$ by $\varepsilon$ is a closed $\varepsilon$-neighborhood of $E$. Then a fundamental property of $d_H$ is that $d_H(E, F) \le \varepsilon$ iff both $E \subseteq F + \varepsilon$ and $F \subseteq E + \varepsilon$ hold simultaneously which leads [Falconer, 1990] to the interesting consequence that

     If $(F_n)_{n=1}^{\infty}$ and $F$ are nonempty compact sets, then $\lim_{n \to \infty} F_n = F$ in the Hausdorff metric iff $F_n \subseteq F + \varepsilon$ and $F \subseteq F_n + \varepsilon$ eventually. Furthermore if $(F_n)_{n=1}^{\infty}$ is a decreasing sequence of elements of a filter-base in $\mathbb{R}^n$, then the nonempty and compact limit set $F$ is given by

\[ \lim_{n \to \infty} F_n = F = \bigcap_{n=1}^{\infty} F_n . \]

Note that since $\mathbb{R}^n$ is Hausdorff, the assumed compactness of $F_n$ ensures that they are also closed in $\mathbb{R}^n$; $F$, therefore, is just the adherent set of the filter-base. In the deterministic algorithm for the generation of fractals by the so-called iterated function system (IFS) approach, $F_n$ is the inverse image by the nth iterate of a non-injective function $f$ having a finite number of injective branches and converging graphically to a multifunction. Under the conditions stated above, the Hausdorff metric ensures convergence of any class of compact subsets in $\mathbb{R}^n$. It appears eminently plausible that our multifunctional graphical convergence on Map($\mathbb{R}^n$) implies Hausdorff convergence on $\mathbb{R}^n$: in fact pointwise biconvergence involves simultaneous convergence of image and preimage nets on $Y$ and $X$, respectively. Thus confining ourselves to the simpler case of pointwise convergence, if $(f_\alpha)_{\alpha \in D}$ is a net of functions in Map(X, Y), then the following theorem expresses the link between convergence in Map(X, Y) and in $Y$.

Theorem 3.1. A net of functions $(f_\alpha)_{\alpha \in D}$ converges to a function $f$ in (Map(X, Y), T) in the topology of pointwise convergence iff $(f_\alpha)$ converges pointwise to $f$ in the sense that $f_\alpha(x) \to f(x)$ in $Y$ for every $x$ in $X$.

Proof.     Necessity. First consider $f_\alpha \to f$ in (Map(X, Y), T). For an open neighborhood $V$ of $f(x)$ in $Y$ with $x \in X$, let $B(x; V)$ be a local neighborhood of $f$ in (Map(X, Y), T), see Eq. (A.6) in Appendix A.1. By assumption of convergence, $(f_\alpha)$ must eventually be in $B(x; V)$ implying that $f_\alpha(x)$ is eventually in $V$. Hence $f_\alpha(x) \to f(x)$ in $Y$.

     Sufficiency. Conversely, if $f_\alpha(x) \to f(x)$ in $Y$ for every $x \in X$, then for a finite collection of points $(x_i)_{i=1}^{I}$ of $X$ ($X$ may itself be uncountable) and corresponding open sets $(V_i)_{i=1}^{I}$ in $Y$ with $f(x_i) \in V_i$, let $B((x_i)_{i=1}^{I}; (V_i)_{i=1}^{I})$ be an open neighborhood of $f$. From the assumed pointwise convergence $f_\alpha(x_i) \to f(x_i)$ in $Y$ for $i = 1, 2, \ldots, I$, it follows that $(f_\alpha(x_i))$ is eventually in $V_i$ for every $(x_i)_{i=1}^{I}$. Because $D$ is a directed set, the existence of a residual applicable globally for all $i = 1, 2, \ldots, I$ is assured leading to the conclusion that $f_\alpha(x_i) \in V_i$ eventually for every $i = 1, 2, \ldots, I$. Hence $f_\alpha \in B((x_i)_{i=1}^{I}; (V_i)_{i=1}^{I})$ eventually; this completes the demonstration that $f_\alpha \to f$ in (Map(X, Y), T), and thus of the proof.

End Tutorial 6

3.2. The extension Multi| (X, Y) of Map(X, Y)

In this section we show how the topological treatment of pointwise convergence of functions to functions given in Example A.1.1 of Appendix 1 can be generalized to generate the boundary Multi| (X, Y) between Map(X, Y) and Multi(X, Y); here $X$ and $Y$ are Hausdorff spaces and Map(X, Y) and Multi(X, Y) are respectively the sets of all functional and non-functional relations between $X$ and $Y$. The generalization we seek defines neighborhoods of $f \in$ Map(X, Y) to consist of those functional relations in Multi(X, Y) whose images at any point $x \in X$ lie not only arbitrarily close to $f(x)$ (this generates the usual topology of pointwise convergence $T_Y$ of Example A.1.1) but whose inverse images at $y = f(x) \in Y$ contain points arbitrarily close to $x$. Thus the graph of $g$ must not only lie close enough to $f(x)$ at $x$ in $V$, but must additionally be such that $g^{-}(y)$ has at least one branch in $U$ about $x$; thus $g$ is constrained to cling to $f$ as the number of points on the graph of $f$ increases

Fig. 8. The power of graphical convergence, illustrated for Example 3.1 (1), shows a local neighborhood of the functions $x$ and $2x$ in (a) and (b) at the four points $(x_i)_{i=1}^{4}$ with corresponding neighborhoods $(U_i)_{i=1}^{4}$ and $(V_i)_{i=1}^{4}$ at $(x_i, f(x_i))$ in $\mathbb{R}$ in the $X$ and $Y$ directions respectively, see Eqs. (34) and (A.6) for the notations. (a) shows a function $g$ in a pointwise neighborhood of $f$ determined by the open sets $V_i$, while (b) shows $g$ in a graphical neighborhood of $f$ due to both $U_i$ and $V_i$. A comparison of these figures demonstrates how the graphical neighborhood forces functions close to $f$ to remain closer to it than if they were in its pointwise neighborhood. This property is clearly visible in (a) where $g$, if it were to be in a graphical neighborhood of $f$, would be more faithful to it by having to be also in $U_2$ and $U_4$. Thus in this case not only must the images $f(x_{ij}) \to f(x_i)$ as $V_i^{j}$ decreases, but also the preimages $x_{ij} \to x_i$ with shrinking $U_i^{j}$. It is this simultaneous convergence of both images and preimages at every $x$ that makes graphical convergence a natural candidate for multifunctional convergence of functions.
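The contrast drawn in Fig. 8 between a pointwise and a graphical neighborhood can be tried on a small numerical example. The sketch below is only an illustration under stated assumptions: the sets $V_i$ and $U_i$ are taken to be open intervals of half-widths `eps` and `delta` about $f(x_i)$ and $x_i$, the test function is assumed continuous so that a sign change on a grid detects a preimage, and all function names are hypothetical.

```python
def in_pointwise_nbd(g, f, xs, eps):
    """True if g(x_i) lies in V_i = (f(x_i) - eps, f(x_i) + eps) for every i."""
    return all(abs(g(x) - f(x)) < eps for x in xs)

def preimage_meets(g, y, x0, delta, samples=2001):
    """Grid test, assuming g continuous: does g take the value y somewhere
    in the open interval U = (x0 - delta, x0 + delta)?"""
    pts = [x0 - delta + 2 * delta * (k + 1) / (samples + 1) for k in range(samples)]
    vals = [g(x) - y for x in pts]
    hits = any(v == 0 for v in vals)
    crossings = any(u * v < 0 for u, v in zip(vals, vals[1:]))
    return hits or crossings

def in_graphical_nbd(g, f, xs, eps, delta):
    """Image condition at each x_i and preimage condition at each y_i = f(x_i)."""
    return in_pointwise_nbd(g, f, xs, eps) and all(
        preimage_meets(g, f(x), x, delta) for x in xs
    )

def f(x):
    return x            # the function x of Fig. 8(a)

def g_far(x):
    return x + 0.08     # within eps of f in value, but preimages shifted by 0.08

def g_near(x):
    return x + 0.02     # close to f in both the Y and the X directions

xs = [0.2, 0.4, 0.6, 0.8]
print(in_pointwise_nbd(g_far, f, xs, eps=0.1))               # image test only
print(in_graphical_nbd(g_far, f, xs, eps=0.1, delta=0.05))   # image and preimage tests
print(in_graphical_nbd(g_near, f, xs, eps=0.1, delta=0.05))
```

With these choices `g_far` passes the image test but fails the preimage test, while `g_near` passes both, which is the sense in which the graphical neighborhood is the more restrictive one. The grid test can miss a preimage where the graph only touches the level $y$ without crossing it.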



with convergence and, unlike in the situation of simple pointwise convergence, no gaps in the graph of the limit object are permitted not only, as in Example A.1.1 on the domain of $f$, but simultaneously on its range too. We call the resulting generated topology the topology of pointwise biconvergence on Map(X, Y), to be denoted by $T$. Thus for any given integer $I \ge 1$, the generalization of Eq. (A.6) gives for $i = 1, 2, \ldots, I$, the open sets of (Map(X, Y), T) to be

\[ B((x_i), (V_i); (y_i), (U_i)) = \{ g \in \mathrm{Map}(X, Y) : (g(x_i) \in V_i) \wedge (g^{-}(y_i) \cap U_i \neq \emptyset),\ i = 1, 2, \ldots, I \}, \tag{34} \]

where $(x_i)_{i=1}^{I}$, $(V_i)_{i=1}^{I}$ are as in that example, $(y_i)_{i=1}^{I} \in Y$, and the corresponding open sets $(U_i)_{i=1}^{I}$ in $X$ are chosen arbitrarily.20 A local base at $f$, for $(x_i, y_i) \in G_f$, is the set of functions of (34) with $y_i = f(x_i)$ and the collection of all local bases

\[ B_\alpha = B((x_i)_{i=1}^{I_\alpha}, (V_i)_{i=1}^{I_\alpha}; (y_i)_{i=1}^{I_\alpha}, (U_i)_{i=1}^{I_\alpha}), \tag{35} \]

for every choice of $\alpha \in D$, is a base ${}_{T}B$ of (Map(X, Y), T). Here the directed set $D$ is used as an indexing tool because, as pointed out in Example A.1.1, the topology of pointwise convergence is not first countable.

     In a manner similar to Eq. (34), the open sets of (Multi(X, Y), $\hat{T}$), where Multi(X, Y) are multifunctions with only countably many values in $Y$ for every point of $X$ (so that we exclude continuous regions from our discussion except for the "vertical lines" of Multi| (X, Y)), can be defined as

\[ \hat{B}((x_i), (V_i); (y_i), (U_i)) = \{ G \in \mathrm{Multi}(X, Y) : (G(x_i) \cap V_i \neq \emptyset) \wedge (G^{-}(y_i) \cap U_i \neq \emptyset) \}, \tag{36} \]

where

\[ G^{-}(y) = \{ x \in X : y \in G(x) \} \]

and $(x_i)_{i=1}^{I} \in D(M)$, $(V_i)_{i=1}^{I}$; $(y_i)_{i=1}^{I} \in R(M)$, $(U_i)_{i=1}^{I}$ are chosen as in the above. The topology $\hat{T}$ of Multi(X, Y) is generated by the collection of

20
     Equation (34) is essentially the intersection of the pointwise topologies (A.6) due to $f$ and $f^{-}$.

all local bases $\hat{B}_\alpha$ for every choice of $\alpha \in D$, and it is not difficult to see from Eqs. (34) and (36), that the restriction $\hat{T}|_{\mathrm{Map}(X, Y)}$ of $\hat{T}$ to Map(X, Y) is just $T$.

     Henceforth $\hat{T}$ and $T$ will be denoted by the same symbol $T$, and convergence in the topology of pointwise biconvergence in (Multi(X, Y), T) will be denoted by $\rightrightarrows$, with the notation being derived from Theorem 3.1.

Definition 3.2 (Functionization of a multifunction). A net of functions $(f_\alpha)_{\alpha \in D}$ in Map(X, Y) converges in (Multi(X, Y), T), $f_\alpha \rightrightarrows M$, if it biconverges pointwise in (Map(X, Y), $T^{*}$). Such a net of functions will be said to be a functionization of $M$.

Theorem 3.2. Let $(f_\alpha)_{\alpha \in D}$ be a net of functions in Map(X, Y). Then

\[ f_\alpha \xrightarrow{G} M \iff f_\alpha \rightrightarrows M . \]

Proof. If $(f_\alpha)$ converges graphically to $M$ then either $D_-$ or $R_-$ is non-empty; let us assume both of them to be so. Then the sequence of functions $(f_\alpha)$ converges pointwise to a function $F$ on $D_-$ and to functions $G$ on $R_-$, and the local basic neighborhoods of $F$ and $G$ generate the topology of pointwise biconvergence.
    Conversely, for pointwise biconvergence on $X$ and $Y$, $R_-$ and $D_-$ must be non-empty.

    Observe that the boundary of Map(X, Y) in the topology of pointwise biconvergence is a "line parallel to the Y-axis". We denote this closure of Map(X, Y) as

Definition 3.3. Multi| ((X, Y), T) = Cl(Map((X, Y), T)).

     The sense in which Multi| (X, Y) is the smallest closed topological extension of $M$ = Map(X, Y) is the following, refer to Theorem A.1.4 and its proof. Let $(M, T_0)$ be a topological space and suppose that

\[ \hat{M} = M \cup \{\hat{m}\} \]

is obtained by adjoining an extra point to $M$; here $M$ = Map(X, Y) and $\hat{m} \in \mathrm{Cl}(M)$ is the multifunctional limit in $\hat{M}$ = Multi| (X, Y). Treat all open sets of $M$ generated by local bases of the type (35) with finite intersection property as a filter-base ${}_{F}B$ on $X$ that induces a filter $F$ on $M$ (by forming supersets of all elements of ${}_{F}B$; see Appendix A.1) and thereby the filter-base

\[ \widehat{{}_{F}B} = \{ \hat{B} = B \cup \{\hat{m}\} : B \in {}_{F}B \} \]

on $\hat{M}$; this filter-base at $\hat{m}$ can also be obtained independently from Eq. (36). Obviously $\widehat{{}_{F}B}$ is an extension of ${}_{F}B$ on $\hat{M}$ and ${}_{F}B$ is the filter induced on $M$ by $\widehat{{}_{F}B}$. We may also consider the filter-base to be a topological base on $M$ that defines a coarser topology $T$ on $M$ (through all unions of members of ${}_{F}B$) and hence the topology

\[ \hat{T} = \{ \hat{G} = G \cup \{\hat{m}\} : G \in T \} \]

on $\hat{M}$ to be the topology associated with $\hat{F}$. A finer topology on $\hat{M}$ may be obtained by adding to $\hat{T}$ all the discarded elements of $T_0$ that do not satisfy FIP. It is clear that $\hat{m}$ is on the boundary of $M$ because every neighborhood of $\hat{m}$ intersects $M$ by construction; thus $(M, T)$ is dense in $(\hat{M}, \hat{T})$ which is the required topological extension of $(M, T)$.

     In the present case, a filter-base at $f \in$ Map(X, Y) is the neighborhood system ${}_{F}B_f$ at $f$ given by decreasing sequences of neighborhoods $(V_k)$ and $(U_k)$ of $f(x)$ and $x$, respectively, and the filter $\hat{F}$ is the neighborhood filter $N_f \cup G$ where $G \in$ Multi| (X, Y). We shall present an alternate, and perhaps more intuitively appealing, description of graphical convergence based on the adherence set of a filter in Sec. 4.1.

     As more serious examples of the graphical convergence of a net of functions to a multifunction than those considered above, Fig. 9 shows the first four iterates of the tent map

\[ t(x) = \begin{cases} 2x, & 0 \le x < \tfrac{1}{2} \\ 2(1 - x), & \tfrac{1}{2} \le x \le 1 \end{cases} \qquad (t^1 = t) \]

defined on [0, 1] and the sine map $f_n = |\sin(2^{n-1} \pi x)|$, $n = 1, \ldots, 4$ with domain [0, 1].

     These examples illustrate the important generalization that periodic points may be replaced by the more general equivalence classes where a sequence of functions converges graphically; this generalization based on the ill-posed interpretation of dynamical systems is significant for non-iterative systems as in the second example above. The equivalence classes of the tent map for its two fixed points 0 and 2/3 generated by the first four iterates are

\[ [0]_4 = \left\{ 0, \tfrac{1}{8}, \tfrac{1}{4}, \tfrac{3}{8}, \tfrac{1}{2}, \tfrac{5}{8}, \tfrac{3}{4}, \tfrac{7}{8}, 1 \right\} \]

Fig. 9. The first four iterates of (a) tent and (b) $|\sin(2^{n-1} \pi x)|$ maps show the formal similarity of the dynamics of these functions. It should be noted, as shown in Fig. 7, that although $(\sin(n \pi x))_{n=1}^{\infty}$ fails to converge at any point other than 0 and 1, the subsequence $(\sin(2^{n-1} \pi x))_{n=1}^{\infty}$ does converge graphically on a set dense in [0, 1].
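The tent-map iterates of Fig. 9(a) lend themselves to exact computation with rational arithmetic. The following is a minimal sketch, with hypothetical helper names, that inverts the two branches of the tent map to collect all points sent to a given value by the mth iterate, and that lists the fixed points of the mth iterate branch by branch, assuming an increasing branch for odd j and a decreasing one for even j.

```python
from fractions import Fraction

def tent(x):
    """Tent map on [0, 1]."""
    return 2 * x if x < Fraction(1, 2) else 2 * (1 - x)

def tent_iter(x, m):
    """m-th iterate t^m(x)."""
    for _ in range(m):
        x = tent(x)
    return x

def preimages(y, m):
    """All x in [0, 1] with t^m(x) = y, obtained by applying the two
    inverse branches x = y/2 and x = 1 - y/2 a total of m times."""
    level = {Fraction(y)}
    for _ in range(m):
        level = {p / 2 for p in level} | {1 - p / 2 for p in level}
    return sorted(level)

def fixed_points(m):
    """Fixed point on the j-th injective branch of t^m, j = 1, ..., 2^m:
    (j - 1)/(2^m - 1) for odd j and j/(2^m + 1) for even j."""
    n = 2 ** m
    return [Fraction(j - 1, n - 1) if j % 2 else Fraction(j, n + 1)
            for j in range(1, n + 1)]

class_of_0 = preimages(0, 4)                  # points sent to 0 by t^4
class_of_two_thirds = preimages(Fraction(2, 3), 4)
c = Fraction(1, 24)
print(class_of_0)
print([x - c for x in class_of_two_thirds[1::2]])   # recovers the grid k/8
print(fixed_points(2))
```

Exact fractions are used instead of floats because repeated doubling in the tent map quickly destroys floating-point accuracy. The set returned for the value 2/3 consists of the sixteen points at distance 1/24 from the grid points k/8.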


    2                     1             1           3        1          5            functions. It is to be noted that the number of equiv-
            =        c,            c,          c,       c,         c,       c,       alent fixed points in a class increases with the num-
    3   4                 8             4           8        2          8
                                                                                     ber of iterations k as 2k−1 + 1; this increase in the
                 3             7
                          c,            c, 1 − c                                     degree of ill-posedness is typical of discrete chaotic
                 4             8                                                     systems and can be regarded as a paradigm of chaos
where c = 1/24. If the moduli of the slopes of the                                   generated by the convergence of a family of func-
graphs passing through these equivalent fixed points                                  tions.
are greater than 1 then the graphs converge to                                           The mth iterate tm of the tent map has 2m fixed
multifunctions and when these slopes are less than                                   points corresponding to the 2m injective branches
1 the corresponding graphs converge to constant                                      of tm

                                 j−1
                                
                                 m      ,              j = 1, 3, . . . , (2m − 1)
                                  2 −1
                 xmj          =                                                            tm (xmj ) = xmj , j = 1, 2, . . . , 2m .
                                 j
                                        ,              j = 2, 4, . . . ,   2m
                                  2m + 1
Let X_m be the collection of these 2^m fixed points (thus X_1 = {0, 2/3}), and denote by [X_m] the set of the equivalent points, one coming from each of the injective branches, for each of the fixed points: thus

$$ D_- = [X_1] = \left\{ [0], \left[\tfrac{2}{3}\right] \right\}, \qquad [X_2] = \left\{ [0], \left[\tfrac{2}{5}\right], \left[\tfrac{2}{3}\right], \left[\tfrac{4}{5}\right] \right\} $$

and D_+ = ∪_{m=1}^∞ [X_m] is a non-empty countable set dense in X at each of which the graphs of the sequence (t^m) converge to a multifunction. New sets [X_n] will be formed by subsequences of the higher iterates t^n for m = in with i = 1, 2, ... where these subsequences remain fixed. For example, the fixed points 2/5 and 4/5 produced respectively by the second and fourth injective branches of t^2 are also fixed for the seventh and thirteenth branches of t^4. For the shift map 2x mod(1) on [0, 1], D_- = {[0], [1]} where [0] = ∪_{m=1}^∞ {(i − 1)/2^m : i = 1, 2, ..., 2^m} and [1] = ∪_{m=1}^∞ {i/2^m : i = 1, 2, ..., 2^m}.

It is useful to compare the graphical convergence of (sin(πnx))_{n=1}^∞ to [0, 1] at 0 and to 0 at 1 with the usual integral and test-functional convergences to 0; note that the point 1/2, for example, belongs to D_+ and not to D_- = {0, 1} because it is frequented by even n only. However for the subsequence (f_{2^{m−1}})_{m∈Z+}, 1/2 is in D_- because if the graph of f_{2^{m−1}} passes through (1/2, 0) for some m, then so do the graphs for all higher values. Therefore [0] = ∪_{m=1}^∞ {i/2^{m−1} : i = 0, 1, ..., 2^{m−1}} is the equivalence class of (f_{2^{m−1}})_{m=1}^∞ and this sequence converges to [−1, 1] on this set. Thus our extension Multi(X) is distinct from the distributional extension of function spaces with respect to test functions, and is able to correctly generate the pathological behavior of the limits that are so crucially vital in producing chaos.

4. Discrete Chaotic Systems are Maximally Ill-posed

The above ideas apply to the development of a criterion for chaos in discrete dynamical systems that is based on the limiting behavior of the graphs of a sequence of functions (f_n) on X, rather than on the values that the sequence generates as is customary. For the development of the maximality of ill-posedness criterion of chaos, we need to refresh ourselves with the following preliminaries.

Resume Tutorial 5: Axiom of Choice and Zorn's Lemma

Let us recall from the first part of this Tutorial that for nonempty subsets (A_α)_{α∈D} of a nonempty set X, the Axiom of Choice ensures the existence of a set A such that A ∩ A_α consists of a single element for every α. The choice axiom has far reaching consequences and a few equivalent statements, one of which, Zorn's lemma, which will be used immediately in the following, is the topic of this resumed Tutorial. The beauty of the Axiom, and of its equivalents, is that they assert the existence of mathematical objects that, in general, cannot be demonstrated, and it is often believed that Zorn's lemma is one of the most powerful tools that a mathematician has available to him that is "almost indispensable in many parts of modern pure mathematics" with significant applications in nearly all branches of contemporary mathematics. This "lemma" talks about maximal (as distinct from "maximum") elements of a partially ordered set, a set in which some notion of x_1 "preceding" x_2 for two elements of the set has been defined.

A relation ⪯ on a set X is said to be a partial order (or simply an order) if it is (compare with the properties (ER1)–(ER3) of an equivalence relation, Tutorial 1)

(OR1) Reflexive, that is (∀x ∈ X)(x ⪯ x).
(OR2) Antisymmetric: (∀x, y ∈ X)(x ⪯ y ∧ y ⪯ x ⇒ x = y).
(OR3) Transitive, that is (∀x, y, z ∈ X)(x ⪯ y ∧ y ⪯ z ⇒ x ⪯ z). Any notion of order on a set X in the sense of one element of X preceding another should possess at least this property.

The relation is a preorder ≾ if it is only reflexive and transitive, that is if only (OR1) and (OR3) are true. If the hypothesis of (OR2) is also satisfied by a preorder, then this ≾ induces an equivalence relation ∼ on X according to (x ≾ y) ∧ (y ≾ x) ⇔ x ∼ y that evidently is actually a partial order iff x ∼ y ⇔ x = y. For any element [x] ∈ X/∼ of the induced quotient space, let ≤ denote the generated order in X/∼ so that

$$ x \precsim y \Leftrightarrow [x] \le [y] ; $$

then ≤ is a partial order on X/∼. If every two elements of X are comparable, in the sense that either x_1 ⪯ x_2 or x_2 ⪯ x_1 for all x_1, x_2 ∈ X, then X is said to be a totally ordered set or a chain. A totally ordered subset (C, ⪯) of a partially ordered set (X, ⪯) with the ordering induced from X, is known as a chain in X if

$$ C = \{ x \in X : (\forall c \in X)(c \preceq x \vee x \preceq c) \} . \tag{37} $$

The most important class of chains that we are concerned with in this work is that on the subsets P(X) of a set (X, ⊆) under the inclusion order; Eq. (37), as we shall see in what follows, defines a family of chains of nested subsets in P(X). Thus while the relation ≾ in Z defined by n_1 ≾ n_2 ⇔ |n_1| ≤ |n_2| with n_1, n_2 ∈ Z preorders Z, it is not a partial order because although −n ≾ n and n ≾ −n for any n ∈ Z, it does not follow that −n = n. A common example of partial order on a set of sets, for example on the power set P(X) of a set X (see footnote 23), is the inclusion relation ⊆: the ordered set 𝒳 = (P({x, y, z}), ⊆) is partially ordered but not totally ordered because, for example, {x, y} ⊆ {y, x}, or {x} is not comparable to {y} unless x = y; however C = {∅, {x}, {x, y}} does represent one of the many possible chains of 𝒳. Another useful example of partial order is the following: Let X and (Y, ≤) be sets with ≤ ordering Y, and consider f, g ∈ Map(X, Y) with

D(f), D(g) ⊆ X. Then

$$
\begin{aligned}
(D(f) \subseteq D(g))\,(f = g|_{D(f)}) &\Leftrightarrow f \preceq g \\
(D(f) = D(g))\,(R(f) \subseteq R(g)) &\Leftrightarrow f \preceq g \\
(\forall x \in D(f) = D(g))\,(f(x) \le g(x)) &\Leftrightarrow f \preceq g
\end{aligned}
\tag{38}
$$

define partial orders on Map(X, Y). In the last case, the order is not total because any two functions whose graphs cross at some point in their common domain cannot be ordered by the given relation, while in the first any f whose graph does not coincide with that of g on the common domain is not comparable to it by this relation.

Let (X, ⪯) be a partially ordered set and let A be a subset of X. An element a_+ ∈ (A, ⪯) is said to be a maximal element of A with respect to ⪯ if

$$ (\forall a \in (A, \preceq))\,(a_+ \preceq a) \Rightarrow a = a_+ , \tag{39} $$

that is, iff there is no a ∈ A with a ≠ a_+ and a ≻ a_+.^21 Expressed otherwise, this implies that an element a_+ of a subset A ⊆ (X, ⪯) is maximal in (A, ⪯) iff it is true that

$$ (a \preceq a_+ \in A)\;(\text{for every } a \in (A, \preceq) \text{ comparable to } a_+) ; \tag{40} $$

thus a_+ in A is a maximal element of A iff it is strictly greater than every other comparable element of A. This of course does not mean that each element a of A satisfies a ⪯ a_+ because every pair of elements of a partially ordered set need not be comparable: in a totally ordered set there can be at most one maximal element. In comparison, an element a_∞ of a subset A ⊆ (X, ⪯) is the unique maximum (largest, greatest, last) element of A iff

$$ (a \preceq a_\infty \in A)\;(\text{for every } a \in (A, \preceq)) , \tag{41} $$

implying that a_∞ is the element of A that is strictly larger than every other element of A. As in the case of the maximal, although this also does not require all elements of A to be comparable to each other, it does require a_∞ to be larger than every element of A. The dual concepts of minimal and minimum can be similarly defined by essentially reversing the roles of a and b in relational expressions like a ⪯ b.

The last concept needed to formalize Zorn's lemma is that of an upper bound: For a subset (A, ⪯) of a partially ordered set (X, ⪯), an element u of X is an upper bound of A in X iff

$$ (a \preceq u \in (X, \preceq))\;(\text{for every } a \in (A, \preceq)) \tag{42} $$

which requires the upper bound u to be larger than all members of A, with the corresponding lower bounds of A being defined in a similar manner. Of course, it is again not necessary that the elements of A be comparable to each other, and it should be clear from Eqs. (41) and (42) that when an upper bound of a set is in the set itself, then it is the maximum element of the set. If the upper (lower) bounds of a subset (A, ⪯) of a set (X, ⪯) have a least (greatest) element, then this smallest upper bound (largest lower bound) is called the least upper bound (greatest lower bound) or supremum (infimum) of A in X. Combining Eqs. (41) and (42) then yields

$$
\begin{aligned}
\sup_X A &= \{ a_{\leftarrow} \in \Omega_A : a_{\leftarrow} \preceq u \ \ \forall u \in (\Omega_A, \preceq) \} \\
\inf_X A &= \{ {}_{\rightarrow}a \in \Lambda_A : l \preceq {}_{\rightarrow}a \ \ \forall l \in (\Lambda_A, \preceq) \}
\end{aligned}
\tag{43}
$$

where Ω_A = {u ∈ X : (∀a ∈ A)(a ⪯ u)} and Λ_A = {l ∈ X : (∀a ∈ A)(l ⪯ a)} are the sets of all upper and lower bounds of A in X. Equation (43) may be expressed in the equivalent but more transparent form as

$$
\begin{aligned}
a_{\leftarrow} = \sup_X A &\Leftrightarrow (a \in A \Rightarrow a \preceq a_{\leftarrow}) \wedge (a_0 \prec a_{\leftarrow} \Rightarrow a_0 \prec b \preceq a_{\leftarrow} \text{ for some } b \in A) \\
{}_{\rightarrow}a = \inf_X A &\Leftrightarrow (a \in A \Rightarrow {}_{\rightarrow}a \preceq a) \wedge ({}_{\rightarrow}a \prec a_1 \Rightarrow {}_{\rightarrow}a \preceq b \prec a_1 \text{ for some } b \in A)
\end{aligned}
\tag{44}
$$

to imply that a_← (→a) is the upper (lower) bound of A in X which precedes (succeeds) every other upper (lower) bound of A in X. Notice that uniqueness in the definitions above is a direct consequence of the uniqueness of greatest and least elements of a set. It must be noted that whereas maximal and maximum are properties of the particular subset and have nothing to do with anything outside it, upper and lower bounds of a set are defined only with respect to a superset that may contain it.

The following example, beside being useful in Zorn's lemma, is also of great significance in fixing some of the basic ideas needed in our future arguments involving classes of sets ordered by the inclusion relation.

Example 4.1. Let 𝒳 = P({a, b, c}) be ordered by the inclusion relation ⊆. The subset 𝒜 = P({a, b, c}) − {a, b, c} has three maximals {a, b}, {b, c} and {c, a} but no maximum as there is no

21
   If ⪯ is an order relation in X then the strict relation ≺ in X corresponding to ⪯, given by x ≺ y ⇔ (x ⪯ y) ∧ (x ≠ y), is
not an order relation because unlike ⪯, ≺ is not reflexive even though it is both transitive and asymmetric.
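
As a minimal sketch (not part of the original text), the order axioms (OR1)–(OR3) and the distinction between maximal and maximum elements can be checked mechanically on small finite sets. The helper names below are hypothetical, the integers are restricted to an assumed finite window of Z, and the family tested is the subset P({a, b, c}) minus the full set {a, b, c} used in Example 4.1.

```python
from itertools import combinations

def is_reflexive(elems, leq):
    # (OR1): every element precedes itself
    return all(leq(x, x) for x in elems)

def is_antisymmetric(elems, leq):
    # (OR2): x <= y and y <= x force x == y
    return all(x == y or not (leq(x, y) and leq(y, x))
               for x in elems for y in elems)

def is_transitive(elems, leq):
    # (OR3): x <= y and y <= z force x <= z
    return all(leq(x, z) or not (leq(x, y) and leq(y, z))
               for x in elems for y in elems for z in elems)

def maximal_elements(subset, leq):
    # a is maximal if no other element of the subset lies strictly above it
    return [a for a in subset
            if not any(leq(a, b) and a != b for b in subset)]

def maximum_element(subset, leq):
    # a is the maximum if every element of the subset precedes it
    for a in subset:
        if all(leq(b, a) for b in subset):
            return a
    return None

def powerset(points):
    pts = list(points)
    return [frozenset(c) for r in range(len(pts) + 1)
            for c in combinations(pts, r)]

# |n1| <= |n2| on a finite window of Z: reflexive and transitive only
window = list(range(-3, 4))
abs_leq = lambda m, n: abs(m) <= abs(n)
preorder_only = (is_reflexive(window, abs_leq)
                 and is_transitive(window, abs_leq)
                 and not is_antisymmetric(window, abs_leq))

# Inclusion order on P({a, b, c}) with the full set removed
inclusion = lambda a, b: a <= b
family = [s for s in powerset("abc") if s != frozenset("abc")]
maximals = maximal_elements(family, inclusion)
maximum = maximum_element(family, inclusion)
print(preorder_only, sorted(map(sorted, maximals)), maximum)
```

Under these assumptions the relation on the integer window fails only antisymmetry, while the inclusion-ordered family has the three two-point sets as maximals and no maximum.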

A_∞ ∈ 𝒜 satisfying A ⪯ A_∞ for every A ∈ 𝒜, while P({a, b, c}) − ∅ has the three minimals {a}, {b} and {c} but no minimum. This shows that a subset of a partially ordered set may have many maximals (minimals) without possessing a maximum (minimum), but a subset has a maximum (minimum) iff this is its unique maximal (minimal). If 𝒜 = {{a, b}, {a, c}}, then every subset of the intersection of the elements of 𝒜, namely {a} and ∅, are lower bounds of 𝒜, and all supersets in 𝒳 of the union of its elements (which in this case is just {a, b, c}) are its upper bounds. Notice that while the maximal (minimal) and maximum (minimum) are elements of 𝒜, upper and lower bounds need not be contained in their sets. In this class (𝒳, ⊆) of subsets of a set X, X_+ is a maximal element of 𝒳 iff X_+ is not contained in any other subset of X, while X_∞ is a maximum of 𝒳 iff X_∞ contains every other subset of X.

Let 𝒜 := {A_α ∈ 𝒳}_{α∈D} be a non-empty subclass of (𝒳, ⊆), and suppose that both ∪A_α and ∩A_α are elements of 𝒳. Since each A_α is ⊆-less than ∪A_α, it follows that ∪A_α is an upper bound of 𝒜; this is also the smallest of all such bounds because if U is any other upper bound then every A_α must precede U by Eq. (42) and therefore so must ∪A_α (because the union of a class of subsets of a set is the smallest that contains each member of the class: A_α ⊆ U ⇒ ∪A_α ⊆ U for subsets (A_α) and U of X). Analogously, since ∩A_α is ⊆-less than each A_α it is a lower bound of 𝒜; that it is the greatest of all the lower bounds L in 𝒳 follows because the intersection of a class of subsets is the largest that is contained in each of the subsets: L ⊆ A_α ⇒ L ⊆ ∩A_α for subsets L and (A_α) of X. Hence the supremum and infimum of 𝒜 in (𝒳, ⊆) given by

$$
\begin{aligned}
A_{\leftarrow} &= \sup_{(\mathcal{X},\subseteq)} \mathcal{A} = \bigcup_{A \in \mathcal{A}} A \\
\text{and}\quad {}_{\rightarrow}A &= \inf_{(\mathcal{X},\subseteq)} \mathcal{A} = \bigcap_{A \in \mathcal{A}} A
\end{aligned}
\tag{45}
$$

are both elements of (𝒳, ⊆). Intuitively, an upper (respectively, lower) bound of 𝒜 in 𝒳 is any subset of X that contains (respectively, is contained in) every member of 𝒜.

The statement of Zorn's lemma and its proof can now be completed in three stages as follows. For Theorem 4.1 below that constitutes the most significant technical first stage, let g be a function on (X, ⪯) that assigns to every x ∈ X an immediate successor y ∈ X such that

$$ M(x) = \{ y \succ x : \nexists\, x_* \in X \text{ satisfying } x \prec x_* \prec y \} $$

are all the successors of x in X with no element of X lying strictly between x and y. Select a representative of M(x) by a choice function f_C such that

$$ g(x) = f_C(M(x)) \in M(x) $$

is an immediate successor of x chosen from the many possible in the set M(x). The basic idea in the proof of the first of the three parts is to express the existence of a maximal element of a partially ordered set X in terms of the existence of a fixed point in the set, which follows as a contradiction of the assumed hypothesis that every point in X has an immediate successor. Our basic application of immediate successors in the following will be to classes 𝒳 ⊆ (P(X), ⊆) of subsets of a set X ordered by inclusion. In this case for any A ∈ 𝒳, the function g can be taken to be the superset

$$
\begin{aligned}
g(A) &= A \cup f_C(G(A) - A), \\
\text{where}\quad G(A) &= \{ x \in X - A : A \cup \{x\} \in \mathcal{X} \}
\end{aligned}
\tag{46}
$$

of A. Repeated application of g to A then generates a principal filter, and hence an associated sequence, based at A.

Theorem 4.1. Let (X, ⪯) be a partially ordered set that satisfies
(ST1) There is a smallest element x_0 of X which has no immediate predecessor in X.
(ST2) If C ⊆ X is a totally ordered subset in X, then c_∗ = sup_X C is in X.
Then there exists a maximal element x_+ of X which has no immediate successor in X.

Proof. Let T ⊆ (X, ⪯) be a subset of X. If the conclusion of the theorem is false then the alternative
(ST3) Every element x ∈ T has an immediate successor g(x) in T^22

22
  This makes T, and hence X, inductively defined infinite sets. It should be realized that (ST3) does not mean that every
member of T is obtained from g, but only ensures that the immediate successor of any element of T is also in T. The infimum
→T of these towers satisfies the additional property of being totally ordered (and is therefore essentially a sequence or net) in
(X, ⪯) to which (ST2) can be applied.
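
The claims around Eq. (45), that in an inclusion-ordered class the supremum is the union and the infimum is the intersection, can be illustrated by brute force on the three-point example. This is only a sketch under stated assumptions: the function names are hypothetical, the ambient class is taken to be the full power set P({a, b, c}), and the family is {{a, b}, {a, c}} from Example 4.1.

```python
from itertools import combinations

def powerset(points):
    pts = list(points)
    return [frozenset(c) for r in range(len(pts) + 1)
            for c in combinations(pts, r)]

def upper_bounds(family, universe):
    # members of the universe that contain every set of the family
    return [u for u in universe if all(a <= u for a in family)]

def lower_bounds(family, universe):
    # members of the universe contained in every set of the family
    return [l for l in universe if all(l <= a for a in family)]

def least(elems):
    # the element preceding all others, if one exists
    for e in elems:
        if all(e <= other for other in elems):
            return e
    return None

def greatest(elems):
    # the element succeeding all others, if one exists
    for e in elems:
        if all(other <= e for other in elems):
            return e
    return None

universe = powerset("abc")
family = [frozenset("ab"), frozenset("ac")]

uppers = upper_bounds(family, universe)
lowers = lower_bounds(family, universe)
supremum = least(uppers)        # least upper bound
infimum = greatest(lowers)      # greatest lower bound

union = frozenset().union(*family)
intersection = frozenset.intersection(*family)
print(sorted(supremum), sorted(infimum))
```

Here neither bound belongs to the family itself, in line with the remark that upper and lower bounds need not be contained in their sets.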

leads, as shown below, to a contradiction that can be resolved only by the conclusion of the theorem. A subset T of (X, ⪯) satisfying conditions (ST1)–(ST3) is sometimes known as a g-tower or a g-sequence: an obvious example of a tower is (X, ⪯) itself. If

$$ {}_{\rightarrow}T = \bigcap \{ T \in \mathcal{T} : T \text{ is an } x_0\text{-tower} \} $$

is the (P(X), ⊆)-infimum of the class 𝒯 of all sequential towers of (X, ⪯), we show that this smallest sequential tower is in fact a sequential totally ordered chain in (X, ⪯) built from x_0 by the g-function. Let the subset

$$ C_T = \{ c \in X : (\forall t \in {}_{\rightarrow}T)(t \preceq c \vee c \preceq t) \} \subseteq X \tag{47} $$

of X be a g-chain in →T in the sense that [cf. Eq. (37)] it is that subset of X each of whose elements is comparable with some element of →T. The conditions (ST1)–(ST3) for C_T can be verified as follows to demonstrate that C_T is a g-tower.

(1) x_0 ∈ C_T, because it is less than each x ∈ →T.
(2) Let c_← = sup_X C_T be the supremum of the chain C_T in X so that by (ST2), c_← ∈ X. Let t ∈ →T. If there is some c ∈ C_T such that t ⪯ c, then surely t ⪯ c_←. Else, c ⪯ t for every c ∈ C_T shows that c_← ⪯ t because c_← is the smallest of all the upper bounds t of C_T. Therefore c_← ∈ C_T.
(3) In order to show that g(c) ∈ C_T whenever c ∈ C_T it needs to be verified that for all t ∈ →T, either t ⪯ c ⇒ t ⪯ g(c) or c ⪯ t ⇒ g(c) ⪯ t. As the former is clearly obvious, we investigate the latter as follows; note that g(t) ∈ →T by (ST3). The first step is to show that the subset

$$ C_g = \{ t \in {}_{\rightarrow}T : (\forall c \in C_T)(t \preceq c \vee g(c) \preceq t) \} \tag{48} $$

of →T, which is a chain in X (observe the inverse roles of t and c here as compared to that in Eq. (47)), is a tower: Let t_← be the supremum of C_g and take c ∈ C_T. If there is some t ∈ C_g for which g(c) ⪯ t, then clearly g(c) ⪯ t_←. Else, t ⪯ c for each t ∈ C_g shows that t_← ⪯ c because t_← is the smallest of all the upper bounds c of C_g. Hence t_← ∈ C_g.

Property (ST3) for C_g follows from a small yet significant modification of the above arguments in which the immediate successor g(t) of t ∈ C_g formally replaces the supremum t_← of C_g. Thus given a c ∈ C_T, if there is some t ∈ C_g for which g(c) ⪯ t then g(c) ≺ g(t); this combined with (c = t) ⇒ (g(c) = g(t)) yields g(c) ⪯ g(t). On the other hand, t ≺ c for every t ∈ C_g requires g(t) ⪯ c as otherwise (t ≺ c) ⇒ (c ≺ g(t)) would, from the resulting consequence t ≺ c ≺ g(t), contradict the assumed hypothesis that g(t) is the immediate successor of t. Hence, C_g is a g-tower in X.

To complete the proof that g(c) ∈ C_T, and thereby the argument that C_T is a tower, we first note that as →T is the smallest tower and C_g is built from it, C_g = →T must in fact be →T itself. From Eq. (48) therefore, for every t ∈ →T either t ⪯ g(c) or g(c) ⪯ t, so that g(c) ∈ C_T whenever c ∈ C_T. This concludes the proof that C_T is actually the tower →T in X.

From (ST2), the implication of the chain C_T

$$ C_T = {}_{\rightarrow}T = C_g \tag{49} $$

being the minimal tower →T is that the supremum t_← of the totally ordered →T in its own tower (as distinct from in the tower X: recall that →T is a subset of X) must be contained in itself, that is

$$ \sup_{C_T}(C_T) = t_{\leftarrow} \in {}_{\rightarrow}T \subseteq X . \tag{50} $$

This however leads to the contradiction from (ST3) that g(t_←) be an element of →T, unless of course

$$ g(t_{\leftarrow}) = t_{\leftarrow} , \tag{51} $$

which because of (49) may also be expressed equivalently as g(c_←) = c_← ∈ C_T. As the sequential totally ordered set →T is a subset of X, Eq. (48) implies that t_← is a maximal element of X which allows (ST3) to be replaced by the remarkable inverse criterion that

(ST3′) If x ∈ X and w precedes x, w ≺ x, then w ∈ X,

that is obviously false for a general tower T. In fact, it follows directly from Eq. (39) that under (ST3′) any x_+ ∈ X is a maximal element of X iff it is a fixed point of g as given by Eq. (51). This proves the theorem and also demonstrates how, starting from a minimum element of a partially ordered set X, (ST3) can be used to generate inductively a totally ordered sequential subset of X leading to a maximal x_+ = c_← ∈ (X, ⪯) that is a fixed point of the generating function g whenever the supremum t_← of the chain →T is in X.

Remark. The proof of this theorem, despite its apparent length and technically involved character,

carries the highly significant underlying message that

     Any inductive sequential g-construction of an infinite chained tower C_T starting with a smallest element x_0 ∈ (X, ⪯) such that a supremum c_← of the g-generated sequential chain C_T in its own tower is contained in itself, must necessarily terminate with a fixed point relation of the type (51) with respect to the supremum. Note from Eqs. (50) and (51) that the role of (ST2) applied to a fully ordered tower is the identification of the maximal of the tower, which depends only on the tower and has nothing to do with anything outside it, with its supremum that depends both on the tower and its complement.

Thus although purely set-theoretic in nature, the filter-base associated with a sequentially totally ordered set may be interpreted to lead to the usual notions of adherence and convergence of filters and thereby of a generated topology for (X, ⪯), see Appendix A.1 and Example A.1.3. This very significant apparent inter-relation between topologies, filters and orderings will form the basis of our approach to the condition of maximal ill-posedness for chaos.

In the second stage of the three-stage programme leading to Zorn's lemma, the tower Theorem 4.1 and the comments of the preceding paragraph are applied at a higher level to a very special class of the power set of a set, the class of all the chains of a partially ordered set, to directly lead to the physically significant

Theorem 4.2 (Hausdorff Maximal Principle). Every partially ordered set (X, ⪯) has a maximal totally ordered subset.^23

Proof. Here, at the base level, let

$$ \mathcal{X} = \{ C \in P(X) : C \text{ is a chain in } (X, \preceq) \} \subseteq P(X) \tag{52} $$

be the set of all the totally ordered subsets of (X, ⪯). Since 𝒳 is a collection of (sub)sets of X, we order it by the inclusion relation on 𝒳 and use the tower Theorem to demonstrate that (𝒳, ⊆) has a maximal element C_←, which by the definition of 𝒳, is the required maximal chain in (X, ⪯).

Let 𝒞 be a chain in 𝒳 of the chains in (X, ⪯). In order to apply the tower Theorem to (𝒳, ⊆) we need to verify hypothesis (ST2) that the smallest

$$ C_* = \sup_{\mathcal{X}} \mathcal{C} = \bigcup_{C \in \mathcal{C}} C \tag{53} $$

of the possible upper bounds of 𝒞 [see Eq. (45)] is a chain of (X, ⪯). Indeed, if x_1, x_2 ∈ X are two points of C_∗ with x_1 ∈ C_1 and x_2 ∈ C_2, then from the ⊆-comparability of C_1 and C_2 we may choose x_1, x_2 ∈ C_1 ⊇ C_2, say. Thus x_1 and x_2 are ⪯-comparable as C_1 is a chain in (X, ⪯); C_∗ ∈ 𝒳 is therefore a chain in (X, ⪯) which establishes that the supremum of a chain of (𝒳, ⊆) is a chain in (X, ⪯).

The tower Theorem 4.1 can now be applied to (𝒳, ⊆) with C_0 as its smallest element to construct a g-sequentially towered fully ordered subset of 𝒳 consisting of chains in X

$$ C_T = \{ C_i \in P(X) : C_i \subseteq C_j \text{ for } i \le j \in \mathbb{N} \} = {}_{\rightarrow}T \subseteq P(X) $$

of (𝒳, ⊆), consisting of the common elements of all g-sequential towers T ∈ 𝒯 of (𝒳, ⊆), that in fact is a principal filter base of chained subsets of (X, ⪯) at C_0. The supremum (chain in X) C_← of C_T in C_T must now satisfy, by Theorem 4.1, the fixed point g-chain of X

$$ \sup_{C_T}(C_T) = C_{\leftarrow} = g(C_{\leftarrow}) \in C_T \subseteq P(X) , $$

where the chain g(C) = C ∪ f_C(G(C) − C) with G(C) = {x ∈ X − C : C ∪ {x} ∈ 𝒳}, is an immediate successor of C obtained by choosing one point x = f_C(G(C) − C) from the many possible in G(C) − C such that the resulting g(C) = C ∪ {x} is a strict successor of the chain C with no others lying between it and C. Note that C_← ∈ (𝒳, ⊆) is

23
   Recall that this means that if there is a totally ordered chain C in (X, ⪯) that succeeds C_+, then C must be C_+ so that no
chain in X can be strictly larger than C_+. The notation adopted here and below is the following: If X = {x, y} is a non-empty
set, then 𝒳 := P(X) = {A : A ⊆ X} = {∅, {x}, {y}, {x, y}} is the set of subsets of X, and 𝔛 := P^2(X) = {𝒜 : 𝒜 ⊆ 𝒳}, the set
of all subsets of 𝒳, consists of the 16 elements ∅, {∅}, {{x}}, {{y}}, {{x, y}}, {∅, {x}}, {∅, {y}}, {∅, {x, y}}, {{x}, {y}},
{{x}, {x, y}}, {{y}, {x, y}}, {∅, {x}, {y}}, {∅, {x}, {x, y}}, {∅, {y}, {x, y}}, {{x}, {y}, {x, y}}, and 𝒳: an element
of P^2(X) is a subset of P(X), any element of which is a subset of X. Thus if C = {0, 1, 2} is a chain in (X = {0, 1, 2}, ≤),
then 𝒞 = {{0}, {0, 1}, {0, 1, 2}} ⊆ P(X) and ℭ = {{{0}}, {{0}, {0, 1}}, {{0}, {0, 1}, {0, 1, 2}}} ⊆ P^2(X) represent chains
in (P(X), ⊆) and (P^2(X), ⊆), respectively.
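
The counting in footnote 23 can be reproduced with a few lines of code. The following is a sketch only, with hypothetical helper names; sets are modelled as Python frozensets so that sets of sets are hashable, and the two nested families are those written out in the footnote for X = {0, 1, 2}.

```python
from itertools import combinations

def powerset(points):
    pts = list(points)
    return {frozenset(c) for r in range(len(pts) + 1)
            for c in combinations(pts, r)}

def is_chain(family):
    # every two members are comparable under inclusion
    fam = list(family)
    return all(a <= b or b <= a for a in fam for b in fam)

fs = frozenset

# X = {x, y}: P(X) has 4 elements, P^2(X) has 2**4 = 16
level1 = powerset({"x", "y"})
level2 = powerset(level1)

# The nested families of footnote 23 for X = {0, 1, 2}
chain_level1 = {fs({0}), fs({0, 1}), fs({0, 1, 2})}
chain_level2 = {
    fs({fs({0})}),
    fs({fs({0}), fs({0, 1})}),
    fs({fs({0}), fs({0, 1}), fs({0, 1, 2})}),
}
print(len(level1), len(level2), is_chain(chain_level1), is_chain(chain_level2))
```

The same `is_chain` test applies unchanged at both levels because inclusion is the order in P(X) and in P^2(X) alike.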



[Figure 10: flow diagram. (X, ⪯) → 𝒳 = {C ⊆ X : C is a chain in (X, ⪯)} → Tower Theorem 4.1 → C_T = ∩{T ⊆ (𝒳, ⊆) : T is a C_0-tower} → Hausdorff Maximal Chain Theorem: sup_{C_T}(C_T) = C_← = g(C_←) ∈ C_T ⊆ (𝒳, ⊆) → Zorn Lemma: (u ∈ X ⪰ c)(∀c ∈ (C_←, ⪯)).]

Fig. 10. Application of Zorn's Lemma to (X, ⪯). Starting with a partially ordered set (X, ⪯), construct:
(a) The one-level higher subset 𝒳 = {C ∈ P(X) : C is a chain in (X, ⪯)} of P(X) consisting of all the totally ordered subsets of (X, ⪯),
(b) The smallest common g-sequential totally ordered towered chain C_T = {C_i ∈ P(X) : C_i ⊆ C_j for i ≤ j} ⊆ P(X) of all sequential g-towers of 𝒳 by Theorem 4.1, which in fact is a principal filter base of totally ordered subsets of (X, ⪯) at the smallest element C_0.
(c) Apply Hausdorff Maximal Principle to (𝒳, ⊆) to get the subset sup_{C_T}(C_T) = C_← = g(C_←) ∈ C_T ⊆ P(X) of (X, ⪯) as the supremum of (𝒳, ⊆) in C_T. The identification of this supremum as a maximal element of (𝒳, ⊆) is a consequence of (ST2) and Eqs. (50), (51) that actually puts the supremum into 𝒳 itself.
By returning to the original level (X, ⪯)
(d) Zorn's Lemma finally yields the required maximal element u ∈ X as an upper bound of the maximal totally ordered subset (C_←, ⪯) of (X, ⪯).
The dashed segment denotes the higher Hausdorff (𝒳, ⊆) level leading to the base (X, ⪯) Zorn level.
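
Steps (a)–(d) of Fig. 10 have a simple finite analogue that can be run directly. The sketch below is an illustration under stated assumptions rather than the construction of the proof: on a finite set no choice axiom is needed, the choice function is assumed to be "first candidate in sorted order", all function names are hypothetical, and the poset is the inclusion-ordered family P({a, b, c}) minus {a, b, c} of Example 4.1.

```python
from itertools import combinations

def powerset(points):
    pts = list(points)
    return [frozenset(c) for r in range(len(pts) + 1)
            for c in combinations(pts, r)]

def is_chain(subset, leq):
    # every two members of the subset are comparable
    return all(leq(a, b) or leq(b, a) for a in subset for b in subset)

def successor(chain, elems, leq):
    # g(C) = C U {x}, x picked from G(C) = {x not in C : C U {x} is a chain};
    # returns C unchanged (a fixed point of g) when G(C) is empty
    candidates = sorted(
        (x for x in elems if x not in chain and is_chain(chain | {x}, leq)),
        key=sorted)
    return chain | {candidates[0]} if candidates else chain

def maximal_chain(start, elems, leq):
    # iterate g from the smallest chain until g(C) = C
    chain = frozenset(start)
    while True:
        nxt = successor(chain, elems, leq)
        if nxt == chain:
            return chain
        chain = nxt

def upper_bound(chain, elems, leq):
    # an element of the poset that succeeds every member of the chain
    for u in elems:
        if all(leq(c, u) for c in chain):
            return u
    return None

inclusion = lambda a, b: a <= b
poset = [s for s in powerset("abc") if s != frozenset("abc")]

c_max = maximal_chain(frozenset(), poset, inclusion)   # Hausdorff level
u = upper_bound(c_max, poset, inclusion)               # Zorn level
u_is_maximal = not any(inclusion(u, v) and v != u for v in poset)
print(sorted(map(sorted, c_max)), sorted(u), u_is_maximal)
```

With this particular choice rule the iteration stops at the chain ∅ ⊆ {a} ⊆ {a, b}, whose upper bound {a, b} is one of the three maximals of the family; a different choice rule would reach a different maximal chain.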


only one of the many maximal fully ordered subsets possible in (X, ⪯).

With the assurance of the existence of a maximal chain C_← among all fully ordered subsets of a partially ordered set (X, ⪯), the arguments are completed by returning to the basic level of X.

Theorem 4.3 (Zorn's Lemma). Let (X, ⪯) be a partially ordered set such that every totally ordered subset of X has an upper bound in X. Then X has at least one maximal element with respect to its order.

Proof. The proof of this final part is a mere application of the Hausdorff Maximal Principle on the existence of a maximal chain C_← in X to the hypothesis of this theorem that C_← has an upper bound u in X that quickly leads to the identification of this bound as a maximal element x_+ of X. Indeed, if there is an element v ∈ X that is comparable to u and v ≻ u, then v cannot be in C_← as it is necessary for every x ∈ C_← to satisfy x ⪯ u. Clearly then C_← ∪ {v} is a chain in (X, ⪯) bigger than C_← which contradicts the assumed maximality of C_← among the chains of X.

The sequence of steps leading to Zorn's Lemma, and thence to the maximal of a partially ordered set, is summarized in Fig. 10.

The three examples below of the application of Zorn's Lemma clearly reflect the increasing complexity of the problem considered, with the maximals a point, a subset, and a set of subsets of X, so that these are elements of X, P(X) and P^2(X), respectively.

Example 4.2
(1) Let X = ({a, b, c}, ⪯) be a three-point base-
tion of this bound as a maximal element x + of X.               (1) Let X = ({a, b, c},          ) be a three-point base-

[Figure 11: tree diagrams of two partially ordered sets, panels (a) and (b); graphics not reproduced.]


Fig. 11. Tree diagrams of two partially ordered sets where two points are connected by a line iff they are comparable to
each other, with the solid lines linking immediate neighbors and the dashed, dotted and dashed–dotted lines denoting second,
third and fourth generation orderings according to the principle of transitivity of the order relation. There are 8 × 2 chains of
(a) and 7 chains of (b) starting from respective smallest elements with the immediate successor chains shown in solid lines.
The 17 point set X = {0, 1, 2, . . . , 15, 16} in (a) has two maximals but no maximum, while in (b) there is a single maximum
of P({a, b, c}), and three maximals without any maximum for P({a, b, c}) − {a, b, c}. In (a), let A = {1, 3, 4, 7, 9, 10, 15},
B = {1, 3, 4, 6, 7, 13, 15}, C = {1, 3, 4, 10, 11, 16} and D = {1, 3, 4}. The upper bounds of D in A are 7, 10 and 15 without
any supremum (as there is no smallest element of {7, 10, 15}), and the upper bounds of D in B are 7 and 15 with $\sup_B(D) = 7$,
while $\sup_C(D) = 10$. Finally the maximal, maximum and the supremum in A of {1, 3, 4, 7} are all the same illustrating how
the supremum of a set can belong to itself. Observe how the supremum and upper bound of a set are with reference to its
complement in contrast with the maximum and maximal that have nothing to do with anything outside the set.
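
The statements about panel (b) can be checked mechanically. The following is a minimal sketch, with hypothetical helper names, that treats P({a, b, c}) as a poset under inclusion and computes its maximum, the maximal elements left after the top element {a, b, c} is removed, and the six saturated chains running from ∅ to {a, b, c} along the edges of the cube.

```python
from itertools import combinations, permutations

def power_set(s):
    """All subsets of s, as frozensets."""
    s = list(s)
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

def maximals(family):
    """Members of `family` not properly contained in any other member."""
    return [A for A in family if not any(A < B for B in family)]

def maximum(family):
    """The member containing every other member, or None if there is none."""
    tops = [A for A in family if all(B <= A for B in family)]
    return tops[0] if tops else None

P = power_set("abc")
P_minus_top = [A for A in P if A != frozenset("abc")]

# Each ordering of a, b, c gives one chain from the empty set to {a, b, c}.
towers = {tuple(frozenset(p[:k]) for k in range(4)) for p in permutations("abc")}

print(maximum(P) == frozenset("abc"))   # True: a single maximum
print(len(maximals(P_minus_top)))       # 3 maximal elements
print(maximum(P_minus_top))             # None: no maximum
print(len(towers))                      # 6 chains from the empty set to the top
```

The three maximal elements of the truncated poset are the two-element subsets, none of which contains the other two, which is why a maximum is absent there.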



level ground set ordered lexicographically, that is $a \prec b \prec c$. A chain $C$ of the partially ordered
Hausdorff-level set $\mathcal{X}$ consisting of subsets of $X$ given by Eq. (52) is, for example, $\{\{a\}, \{a, b\}\}$ and
the six g-sequential chained towers
\[
\begin{aligned}
C_1 &= \{\emptyset, \{a\}, \{a, b\}, \{a, b, c\}\}, & C_2 &= \{\emptyset, \{a\}, \{a, c\}, \{a, b, c\}\}, \\
C_3 &= \{\emptyset, \{b\}, \{a, b\}, \{a, b, c\}\}, & C_4 &= \{\emptyset, \{b\}, \{b, c\}, \{a, b, c\}\}, \\
C_5 &= \{\emptyset, \{c\}, \{a, c\}, \{a, b, c\}\}, & C_6 &= \{\emptyset, \{c\}, \{b, c\}, \{a, b, c\}\}
\end{aligned}
\]
built from the smallest element $\emptyset$ corresponding to the six distinct ways of reaching $\{a, b, c\}$ from $\emptyset$
along the sides of the cube marked on the figure with solid lines, all belong to $\mathcal{X}$; see Fig. 11(b).
An example of a tower in $(\mathcal{X}, \subseteq)$ which is not a chain is
$T = \{\emptyset, \{a\}, \{b\}, \{c\}, \{a, b\}, \{a, c\}, \{b, c\}, \{a, b, c\}\}$. Hence the common infimum towered chained subset
\[
C_T = \{\emptyset, \{a, b, c\}\} = {}_{\rightarrow}T \subseteq \mathcal{P}(X)
\]
of $\mathcal{X}$, with
\[
\sup_{C_T}(C_T) = C_{\leftarrow} = \{a, b, c\} = g(C_{\leftarrow}) \in C_T \subseteq \mathcal{P}(X)
\]
the only maximal element of $\mathcal{P}(X)$. Zorn’s Lemma now assures the existence of a maximal element
$c \in X$. Observe how the maximal element of $(X, \preceq)$ is obtained by going one level higher to $\mathcal{X}$ at the
Hausdorff stage and returning to the base level $X$ at Zorn, see Fig. 10 for a schematic summary of this
sequence of steps.

(2) Basis of a vector space. A linearly independent set of vectors in a vector space $X$ that spans the
space is known as the Hamel basis of $X$. To prove the existence of a Hamel basis in a vector space,
Zorn’s lemma is invoked as follows.

    The ground base level of the linearly independent subsets of $X$
\[
\mathcal{X} = \Big\{ \{x_{i_j}\}_{j=1}^{J} \in \mathcal{P}(X) : \mathrm{Span}\big(\{x_{i_j}\}_{j=1}^{J}\big) = 0 \Rightarrow (\alpha_j)_{j=1}^{J} = 0 \ \ \forall J \ge 1 \Big\} \subseteq \mathcal{P}(X),
\]
with $\mathrm{Span}(\{x_{i_j}\}_{j=1}^{J}) := \sum_{j=1}^{J} \alpha_j x_{i_j}$, is such that no
$x \in \mathcal{X}$ can be expressed as a linear combination of

the elements of $\mathcal{X} - \{x\}$. $\mathcal{X}$ clearly has a smallest element, say $\{x_{i_1}\}$, for some non-zero $x_{i_1} \in X$. Let
the higher Hausdorff level
\[
\mathfrak{X} = \{C \in \mathcal{P}^2(X) : C \text{ is a chain in } (\mathcal{X}, \subseteq)\} \subseteq \mathcal{P}^2(X)
\]
and collection of the chains
\[
C_{i_K} = \{\{x_{i_1}\}, \{x_{i_1}, x_{i_2}\}, \ldots, \{x_{i_1}, x_{i_2}, \ldots, x_{i_K}\}\} \in \mathcal{P}^2(X)
\]
of $\mathcal{X}$ comprising linearly independent subsets of $X$ be g-built from the smallest $\{x_{i_1}\}$. Any chain $\mathcal{C}$ of
$\mathcal{X}$ is bounded above by the union $C_* = \sup_{\mathcal{X}} \mathcal{C} = \bigcup_{C \in \mathcal{C}} C$ which is a chain in $\mathcal{X}$ containing $\{x_{i_1}\}$,
thereby verifying (ST2) for $\mathfrak{X}$. Application of the tower theorem to $\mathfrak{X}$ implies that the element
\[
C_T = \{C_{i_1}, C_{i_2}, \ldots, C_{i_n}, \ldots\} = {}_{\rightarrow}T \subseteq \mathcal{P}^2(X)
\]
in $\mathfrak{X}$ of chains of $\mathcal{X}$ is a g-sequential fully ordered towered subset of $(\mathfrak{X}, \subseteq)$ consisting of the
common elements of all g-sequential towers of $(\mathfrak{X}, \subseteq)$, that in fact is a chained principal ultrafilter on
$(\mathcal{P}(X), \subseteq)$ generated by the filter-base $\{\{\{x_{i_1}\}\}\}$ at $\{x_{i_1}\}$, where
\[
T = \{C_{i_1}, C_{i_2}, \ldots, C_{j_n}, C_{j_{n+1}}, \ldots\}
\]
for some $n \in \mathbb{N}$ is an example of non-chained g-tower whenever $(C_{j_k})_{k=n}^{\infty}$ is neither contained in nor
contains any member of the $(C_{i_k})_{k=1}^{\infty}$ chain. Hausdorff’s chain theorem now yields the fixed-point
g-chain $C_{\leftarrow} \in \mathfrak{X}$ of $\mathcal{X}$
\[
\sup_{C_T}(C_T) = C_{\leftarrow} = \{\{x_{i_1}\}, \{x_{i_1}, x_{i_2}\}, \{x_{i_1}, x_{i_2}, x_{i_3}\}, \ldots\} = g(C_{\leftarrow}) \in C_T \subseteq \mathcal{P}^2(X)
\]
as a maximal totally ordered principal filter on $X$ that is generated by the filter-base $\{\{x_{i_1}\}\}$ at $x_{i_1}$,
whose supremum $B = \{x_{i_1}, x_{i_2}, \ldots\} \in \mathcal{P}(X)$ is, by Zorn’s lemma, a maximal element of the base level
$\mathcal{X}$. This maximal linearly independent subset of $X$ is the required Hamel basis for $X$: Indeed, if the
span of $B$ is not the whole of $X$, then $\mathrm{Span}(B) \cup x$, with $x \notin \mathrm{Span}(B)$ would, by definition, be a linearly
independent set of $X$ strictly larger than $B$, contradicting the assumed maximality of the latter. It
needs to be understood that since the infinite basis cannot be classified as being linearly independent,
we have here an important example of the supremum of the maximal chained set not belonging to
the set even though this criterion was explicitly used in the construction process according to (ST2) and
(ST3).

    Compared to this purely algebraic concept of basis in a vector space, is the Schauder basis in
a normed space which combines topological structure with the linear in the form of convergence: If
a normed vector space contains a sequence $(e_i)_{i \in \mathbb{Z}_+}$ with the property that for every $x \in X$ there is a
unique sequence of scalars $(\alpha_i)_{i \in \mathbb{Z}_+}$ such that the remainder $\| x - (\alpha_1 e_1 + \alpha_2 e_2 + \cdots + \alpha_I e_I) \|$
approaches 0 as $I \to \infty$, then the collection $(e_i)$ is known as a Schauder basis for $X$.

(3) Ultrafilter. Let $X$ be a set. The set
${}_{F}S = \{S_\alpha \in \mathcal{P}(X) : S_\alpha \cap S_\beta \neq \emptyset,\ \forall \alpha \neq \beta\} \subseteq \mathcal{P}(X)$ of all
nonempty subsets of $X$ with finite intersection property is known as a filter subbase on $X$ and
${}_{F}B = \{B \subseteq X : B = \bigcap_{i \in I \subset D} S_i\}$, for $I \subset D$ a finite subset
of a directed set $D$, is a filter-base on $X$ associated with the subbase ${}_{F}S$; cf. Appendix A.1. Then the
filter generated by ${}_{F}S$ consisting of every superset of the finite intersections $B \in {}_{F}B$ of sets of ${}_{F}S$ is
the smallest filter that contains the subbase ${}_{F}S$ and base ${}_{F}B$. For notational simplicity, we will denote
the subbase ${}_{F}S$ in the rest of this example simply by $S$.

    Consider the base-level ground set of all filter subbases on $X$
\[
\mathcal{S} = \Big\{ S \in \mathcal{P}^2(X) : \bigcap_{\emptyset \neq R \subseteq S} R \neq \emptyset \text{ for every finite subset of } S \Big\} \subseteq \mathcal{P}^2(X),
\]
ordered by inclusion in the sense that $S_\alpha \subseteq S_\beta$ for all $\alpha \preceq \beta \in D$, and let the higher Hausdorff-level
\[
\tilde{\mathfrak{X}} = \{C \in \mathcal{P}^3(X) : C \text{ is a chain in } (\mathcal{S}, \subseteq)\} \subseteq \mathcal{P}^3(X)
\]
comprising the collection of the totally ordered chains
\[
C_\kappa = \{\{S_\alpha\}, \{S_\alpha, S_\beta\}, \ldots, \{S_\alpha, S_\beta, \ldots, S_\kappa\}\} \in \mathcal{P}^3(X)
\]
of $\mathcal{S}$ be g-built from the smallest $\{S_\alpha\}$ then an ultrafilter on $X$ is a maximal member $S_+$ of $(\mathcal{S}, \subseteq)$ in the
usual sense that any subbase $S$ on $X$ must necessarily be contained in $S_+$ so that $S_+ \subseteq S \Rightarrow S = S_+$
for any $S \subseteq \mathcal{P}(X)$ with FIP. The tower theorem now implies that the element
\[
\tilde{C}_T = \{C_\alpha, C_\beta, \ldots, C_\nu, \ldots\} = {}_{\rightarrow}\tilde{T} \subseteq \mathcal{P}^3(X)
\]
of $\mathcal{P}^4(X)$, which is a chain in $\tilde{\mathfrak{X}}$ of the chains of $\mathcal{S}$, is a g-sequential fully ordered towered subset of the
common elements of all sequential towers of $(\tilde{\mathfrak{X}}, \subseteq)$

that is a chained principal ultrafilter on $(\mathcal{P}^2(X), \subseteq)$ generated by the filter-base $\{\{\{S_\alpha\}\}\}$ at $\{S_\alpha\}$, where
\[
\tilde{T} = \{C_\alpha, C_\beta, \ldots, C_\sigma, C_\varsigma, \ldots\},
\]
is an obvious example of non-chained g-tower whenever $(C_\sigma)$ is neither contained in, nor contains, any
member of the $C_\alpha$-chain. Hausdorff’s chain theorem now yields the fixed-point $\tilde{C}_{\leftarrow} \in \tilde{\mathfrak{X}}$
\[
\sup_{\tilde{C}_T}(\tilde{C}_T) = \tilde{C}_{\leftarrow} = \{\{S_\alpha\}, \{S_\alpha, S_\beta\}, \{S_\alpha, S_\beta, S_\gamma\}, \ldots\} = g(\tilde{C}_{\leftarrow}) \in \tilde{C}_T \subseteq \mathcal{P}^3(X)
\]
as a maximal totally ordered g-chained towered subset of $X$ that is, by Zorn’s lemma, a maximal element
of the base level subset $\mathcal{S}$ of $\mathcal{P}^2(X)$. $\tilde{C}_{\leftarrow}$ is a chained principal ultrafilter on $(\mathcal{P}(X), \subseteq)$ generated
by the filter-base $\{\{S_\alpha\}\}$ at $S_\alpha$, while $S_+ = \{S_\alpha, S_\beta, S_\gamma, \ldots\} \in \mathcal{P}^2(X)$ is a (non-principal)
ultrafilter on $X$, characterized by the property that any collection of subsets on $X$ with FIP (that is any
filter subbase on $X$) must be contained in the maximal set $S_+$ having FIP, that is not a principal
filter unless $S_\alpha$ is a singleton set $\{x_\alpha\}$.

    What emerges from these applications of Zorn’s Lemma is the remarkable fact that infinities (the
dot-dot-dots) can be formally introduced as “limiting cases” of finite systems in a purely set-theoretic
context without the need for topologies, metrics or convergences. The significance of this observation
will become clear from our discussions on filters and topology leading to Sec. 4.2 below. Also, the
observation on the successive iterates of the power sets $\mathcal{P}(X)$ in the examples above was to suggest their
anticipated role in the complex evolution of a dynamical system that is expected to play a significant
part in our future interpretation and understanding of this adaptive and self-organizing phenomenon of
nature.

End Tutorial 5

From the examples in Tutorial 5, it should be clear that the sequential steps summarized in Fig. 10 are
involved in an application of Zorn’s lemma to show that a partially ordered set has a maximal element
with respect to its order. Thus for a partially ordered set $(X, \preceq)$, form the set $\mathcal{X}$ of all chains $C$ in
$X$. If $C_+$ is a maximal chain of $X$ obtained by the Hausdorff Maximal Principle from the chain $C$ of
all chains of $X$, then its supremum $u$ is a maximal element of $(X, \preceq)$. This sequence is now applied, as
in Example 4.2(1), to the set of arbitrary relations $\mathrm{Multi}(X)$ on an infinite set $X$ in order to formulate
our definition of chaos that follows.

    Let $f$ be a noninjective map in $\mathrm{Multi}(X)$ and $P(f)$ the number of injective branches of $f$. Denote
by
\[
F = \{f \in \mathrm{Multi}(X) : f \text{ is a noninjective function on } X\} \subseteq \mathrm{Multi}(X)
\]
the resulting basic collection of noninjective functions in $\mathrm{Multi}(X)$.

  (i) For every $\alpha$ in some directed set $D$, let $F$ have the extension property
\[
(\forall f_\alpha \in F)(\exists f_\beta \in F) : P(f_\alpha) \le P(f_\beta)
\]
 (ii) Let a partial order $\preceq$ on $\mathrm{Multi}(X)$ be defined, for $f_\alpha, f_\beta \in \mathrm{Map}(X) \subseteq \mathrm{Multi}(X)$ by
\[
P(f_\alpha) \le P(f_\beta) \Leftrightarrow f_\alpha \preceq f_\beta, \tag{54}
\]
      with $P(f) := 1$ for the smallest $f$, so as to define a partially ordered subset $(F, \preceq)$ of $\mathrm{Multi}(X)$. This
      is actually a preorder on $\mathrm{Multi}(X)$ in which functions with the same number of injective
      branches are equivalent to each other.
(iii) Let
\[
C_\nu = \{f_\alpha \in \mathrm{Multi}(X) : f_\alpha \preceq f_\nu\} \in \mathcal{P}(F), \qquad \nu \in D,
\]
      be g-chains of non-injective functions of $\mathrm{Multi}(X)$ and
\[
\mathcal{X} = \{C \in \mathcal{P}(F) : C \text{ is a chain in } (F, \preceq)\} \subseteq \mathcal{P}(F)
\]
      denote the corresponding Hausdorff level of all chains of $F$, with
\[
C_T = \{C_\alpha, C_\beta, \ldots, C_\nu, \ldots\} = {}_{\rightarrow}T \subseteq \mathcal{P}(F)
\]
      being a g-sequential towered chain in $\mathcal{X}$. By Hausdorff Maximal Principle, there is a maximal
      fixed-point g-towered chain $C_{\leftarrow} \in \mathcal{X}$ of $F$
\[
\sup_{C_T}(C_T) = C_{\leftarrow} = \{f_\alpha, f_\beta, \ldots\} = g(C_{\leftarrow}) \in C_T \subseteq \mathcal{P}(F).
\]

Zorn’s Lemma applied to this maximal chain yields its supremum as the maximal element of $C_{\leftarrow}$, and
thereby of $F$. It needs to be appreciated, as in the case of the algebraic Hamel basis, that the
existence of this maximal non-functional element was

obtained purely set theoretically as the “limit” of a net of functions with increasing nonlinearity,
without resorting to any topological arguments. Because it is not a function, this supremum does not
belong to the functional g-towered chain having it as a fixed point, and this maximal chain does not
possess a largest, or even a maximal, element, although it does have a supremum.^{24} The supremum
is a contribution of the inverse functional relations $(f_\alpha^{-})$ in the following sense. From Eq. (2), the net
of increasingly non-injective functions of Eq. (54) implies a corresponding net of increasingly
multivalued functions ordered inversely by the inverse relation
$f_\alpha \preceq f_\beta \Leftrightarrow f_\beta^{-} \preceq f_\alpha^{-}$. Thus the inverse relations,
which are as much an integral part of graphical convergence as are the direct relations, have a
smallest element belonging to the multifunctional class. Clearly, this smallest element as the required
supremum of the increasingly non-injective tower of functions defined by Eq. (54), serves to complete
the significance of the tower by capping it with a “boundary” element that can be taken to bridge the
classes of functional and non-functional relations on $X$.

     We are now ready to define a maximally ill-posed problem $f(x) = y$ for $x, y \in X$ in terms of a
maximally non-injective map $f$ as follows.

Definition 4.1 (Chaotic map). Let $A$ be a non-empty closed set of a compact Hausdorff space $X$. A
function $f \in \mathrm{Multi}(X)$, equivalently the sequence of functions $(f_i)$, is maximally non-injective or chaotic
on $A$ with respect to the order relation (54) if

(a) for any $f_i$ on $A$ there exists an $f_j$ on $A$ satisfying $f_i \preceq f_j$ for every $j > i \in \mathbb{N}$.
(b) the set $D_+$ consists of a countable collection of isolated singletons.

Definition 4.2 (Maximally ill-posed problem). Let $A$ be a non-empty closed set of a compact
Hausdorff space $X$ and let $f$ be a functional relation in $\mathrm{Multi}(X)$. The problem $f(x) = y$ is maximally
ill-posed at $y$ if $f$ is chaotic on $A$.

     As an example of the application of these definitions, on the dense set $D_+$, the tent map
satisfies both the conditions of sensitive dependence on initial conditions and topological transitivity
[Devaney, 1989] and is also maximally non-injective; the tent map is therefore chaotic on $D_+$. In
contrast, the examples of Secs. 1 and 2 are not chaotic as the maps are not topologically transitive,
although the Liapunov exponents, as in the case of the tent map, are positive. Here the $(f_n)$ are
identified with the iterates of $f$, and the “fixed point” as one through which graphs of all the functions on
residual index subsets pass. When the set of points $D_+$ is dense in $[0, 1]$ and both $D_+$ and
$[0, 1] - D_+ = [0, 1] - \bigcup_{i=0}^{\infty} f^{-i}(\mathrm{Per}(f))$ (where $\mathrm{Per}(f)$ denotes the
set of periodic points of $f$) are totally disconnected, it is expected that at any point on this complement
the behavior of the limit will be similar to that on $D_+$: these points are special as they tie up the
iterates on $\mathrm{Per}(f)$ to yield the multifunctions. Therefore in any neighborhood $U$ of a $D_+$-point, there
is an $x_0$ at which the forward orbit $\{f^i(x_0)\}_{i \ge 0}$ is chaotic in the sense that

(a) the sequence neither diverges nor does it converge in the image space of $f$ to a periodic orbit
    of any period, and
(b) the Liapunov exponent given by
\[
\lambda(x_0) \overset{\mathrm{def}}{=} \lim_{n \to \infty} \ln \left| \frac{d f^n(x_0)}{dx} \right|^{1/n}
= \lim_{n \to \infty} \frac{1}{n} \sum_{i=0}^{n-1} \ln \left| \frac{d f(x_i)}{dx} \right|, \qquad x_i = f^i(x_0),
\]
    which is a measure of the average slope of an orbit at $x_0$ or equivalently of the average loss of
    information of the position of a point after one iteration, is positive.

Thus an orbit with positive Liapunov exponent is chaotic if it is not asymptotic (that is neither
convergent nor adherent, having no convergent suborbit in the sense of Appendix A.1) to an unstable
periodic orbit or to any other limit set on which the dynamics is simple. A basic example of a chaotic
orbit is that of an irrational in $[0, 1]$ under the shift map and that of the chaotic set its closure, the full
unit interval.

     Let $f \in \mathrm{Map}((X, \mathcal{U}))$ and suppose that $A = \{f^j(x_0)\}_{j \in \mathbb{N}}$ is a sequential set corresponding to the
orbit $\mathrm{Orb}(x_0) = (f^j(x_0))_{j \in \mathbb{N}}$, and let $f_{R_i}(x_0) = \bigcup_{j \ge i} f^j(x_0)$ be the $i$-residual of the sequence
$(f^j(x_0))_{j \in \mathbb{N}}$, with ${}_{F}B_{x_0} = \{f_{R_i}(x_0) : \mathrm{Res}(\mathbb{N}) \to X$

24
  A similar situation arises in the following more intuitive example. Although the subset $A = \{1/n\}_{n \in \mathbb{Z}_+}$ of the interval
$I = [-1, 1]$ has no smallest or minimal elements, it does have the infimum 0. Likewise, although $A$ is bounded below by any
element of $[-1, 0)$, it has no greatest lower bound in $[-1, 0) \cup (0, 1]$.

for all $i \in \mathbb{N}\}$ being the decreasingly nested filter-base associated with $\mathrm{Orb}(x_0)$. The so-called
$\omega$-limit set of $x_0$ given by
\[
\begin{aligned}
\omega(x_0) &\overset{\mathrm{def}}{=} \{x \in X : (\exists n_k \in \mathbb{N})(n_k \to \infty)(f^{n_k}(x_0) \to x)\} \\
&= \{x \in X : (\forall N \in \mathcal{N}_x)(\forall f_{R_i} \in {}_{F}B_{x_0})(f_{R_i}(x_0) \cap N \neq \emptyset)\}
\end{aligned}
\tag{55}
\]
is simply the adherence set $\mathrm{adh}(f^j(x_0))$ of the sequence $(f^j(x_0))_{j \in \mathbb{N}}$, see Eq. (A.39); hence
Definition A.1.11 of the filter-base associated with a sequence and Eqs. (A.16), (A.24), (A.31) and (A.34)
allow us to express $\omega(x_0)$ more meaningfully as
\[
\omega(x_0) = \bigcap_{i \in \mathbb{N}} \mathrm{Cl}(f_{R_i}(x_0)). \tag{56}
\]
It is clear from the second of Eqs. (55) that for a continuous $f$ and any $x \in X$, $x \in \omega(x_0)$ implies
$f(x) \in \omega(x_0)$ so that the entire orbit of $x$ lies in $\omega(x_0)$ whenever $x$ does, which implies that the
$\omega$-limit set is positively invariant; it is also closed because the adherent set is a closed set according to
Theorem A.1.3. Hence $x_0 \in \omega(x_0) \Rightarrow A \subseteq \omega(x_0)$ reduces the $\omega$-limit set to the closure of $A$
without any isolated points, $A \subseteq \mathrm{Der}(A)$. In terms of Eq. (A.33) involving principal filters, Eq. (56)
in this case may be expressed in the more transparent form
$\omega(x_0) = \mathrm{Cl}({}_{F}P(\{f^j(x_0)\}_{j=0}^{\infty}))$ where the principal filter ${}_{F}P(\{f^j(x_0)\}_{j=0}^{\infty})$ at $A$ consists
of all supersets of $A = \{f^j(x_0)\}_{j=0}^{\infty}$, and $\omega(x_0)$ represents the adherence set of the principal filter at
$A$, see the discussion following Theorem A.1.3. If $A$ represents a chaotic orbit under this condition,
then $\omega(x_0)$ is sometimes known as a chaotic set [Alligood et al., 1997]; thus the chaotic orbit
infinitely often visits every member of its chaotic set,^{25} which is simply the $\omega$-limit set of a chaotic
orbit that is itself contained in its own limit set. Clearly the chaotic set is positively invariant, and
from Theorem A.1.3 and its corollary it is also compact. Furthermore, if all (sub)sequences emanating
from points $x_0$ in some neighborhood of the set converge to it, then $\omega(x_0)$ is called a chaotic attractor,
see [Alligood et al., 1997]. As common examples of chaotic sets that are not attractors mention may be
made of the tent map with a peak value larger than 1 at 0.5, and the logistic map with $\lambda \ge 4$ again with
a peak value at 0.5 exceeding 1.

     It is important that the difference in the dynamical behavior of the system on $D_+$ and its
complement be appreciated. At any fixed point $x$ of $f^i$ in $D_+$ (or at its equivalent images in $[x]$) the
dynamics eventually gets attached to the (equivalent) fixed point, and the sequence of iterates converges
graphically in $\mathrm{Multi}(X)$ to $x$ (or its equivalent points). When $x \notin D_+$, however, the orbit
$A = \{f^i(x)\}_{i \in \mathbb{N}}$ is chaotic in the sense that $(f^i(x))$ is not asymptotically periodic and, not being
attached to any particular point, the iterates wander about in the closed chaotic set $\omega(x) = \mathrm{Der}(A)$
containing $A$ such that for any given point in the set, some subsequence of the chaotic orbit gets
arbitrarily close to it. Such sequences do not converge anywhere but only frequent every point of
$\mathrm{Der}(A)$. Thus in the limit of progressively larger iterations there is complete uncertainty of the
outcome of an experiment conducted at either of these two categories of initial points; but whereas on
$D_+$ this is due to a random choice from a multifunctional set of equally probable outputs as dictated by
the specific conditions under which the experiment was conducted at that instant, on its complement the
uncertainty is due to the chaotic behavior of the functional iterates themselves. Nevertheless it must be
clearly understood that this latter behavior is entirely due to the multifunctional limits at the $D_+$
points which completely determine the behavior of the system on its complement. As an explicit
illustration of this situation, recall that for the shift map $2x \bmod 1$ the $D_+$ points are the rationals on
$[0, 1]$, and any irrational is represented by a non-terminating and non-repeating decimal so that almost
all decimals in $[0, 1]$ in any base contain all possible sequences of any number of digits. For the logistic
map, the situation is more complex, however. Here the onset of chaos marking the end of the period
doubling sequence at $\lambda_* = 3.5699456$ is signaled by the disappearance of all stable fixed points,
Fig. 13(c), with Fig. 13(a) being a demonstration of the stable limits for $\lambda = 3.569$ that show up as
convergence of the iterates to constant valued functions (rather than as constant valued inverse functions)
at stable fixed points, shown more emphatically in Fig. 12(a). What actually happens at $\lambda_*$ is shown in
Fig. 16(a) in the next subsection: the almost vertical lines produced at a large, but finite, iterations $i$

25
   How does this happen for A = {f^i(x0)}_{i∈N} that is not the constant sequence (x0) at a fixed point? As i ∈ N increases,
points are added to {x0, f(x0), . . . , f^I(x0)} not, as would be the case in a normal sequence, as a piled-up Cauchy tail, but as
points generally lying between those already present; recall a typical graph as of Fig. 9, for example.
[Figure 12, four panels on the unit square: (a) stable 1-cycle, λ = 2.95, graphical limit at iterate 9001;
(b) stable 2-cycle, λ = 3.4, graphical limit at iterates 9001-9002; (c) stable 4-cycle, λ = 3.5, graphical
limit at iterates 9001-9004; (d) stable 8-cycle, λ = 3.55, graphical limit at iterates 9001-9008.]

Fig. 12. Fixed points and cycles of logistic map. The isolated fixed point of (b) yields two non-fixed points to which the
iterates converge simultaneously in the sense that the generated sequence converges to one iff it converges to the other. This
suggests that nonlinear dynamics of a system can lead to a situation in which sequences in a Hausdorff space may converge
to more than one point. Since convergence depends on the topology (Corollary to Theorem A.1.5), this may be interpreted
to mean that nonlinearity tends to modify the basic structure of a space. The sequence of points generated by the iterates of
the map is marked on the y-axis of (a)–(c) in italics. The singletons {x} are ω-limit sets of the respective fixed point x and
are generated by the constant sequence (x, x, . . .). Whereas in (a) this is the limit of every point in (0, 1), in the other cases
these fixed points are isolated in the sense of Definition 2.3. The isolated points, however, give rise to sequences that converge
to more than one point in the form of limit cycles as shown in (b)–(d).
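
The cycle lengths quoted in the panels of Fig. 12 can be checked numerically. The following is a minimal sketch, not the procedure used to produce the figure: the helper names `logistic` and `cycle_length`, the starting point x0 = 0.5, the tolerance, and the search bound `max_period` are all assumptions made for illustration. It discards 9000 iterates, in the spirit of the iterate counts shown in the panels, and then looks for the smallest p with f^p(x) ≈ x.

```python
def logistic(x, lam):
    """One step of the logistic map x -> lam * x * (1 - x)."""
    return lam * x * (1.0 - x)


def cycle_length(lam, x0=0.5, burn_in=9000, max_period=64, tol=1e-9):
    """Smallest p with |f^p(x) - x| < tol after burn_in iterations.

    Hypothetical helper: returns None if no period <= max_period is found.
    """
    x = x0
    for _ in range(burn_in):          # discard the transient
        x = logistic(x, lam)
    y = x
    for p in range(1, max_period + 1):
        y = logistic(y, lam)          # y = f^p(x)
        if abs(y - x) < tol:
            return p
    return None


# The four parameter values of Fig. 12(a)-(d).
periods = {lam: cycle_length(lam) for lam in (2.95, 3.4, 3.5, 3.55)}
print(periods)
```

Under these assumptions the detected periods double from panel to panel, which is the numerical counterpart of a sequence adhering at 1, 2, 4 and 8 points.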

(the multifunctions are generated only in the limiting sense of i → ∞ and represent a boundary between
functional and non-functional relations on a set), decrease in magnitude with increasing iterations
until they reduce to points. This gives rise to a (totally disconnected) Cantor set on the y-axis in
contrast with the connected intervals that the multifunctional limits at λ > λ∗ of Figs. 16(b)–16(d)
produce. By our characterization Definition 4.1 of chaos therefore, λx(1 − x) is chaotic for the values
of λ > λ∗ that are shown in Fig. 16. We return to this case in the following subsection.
     As an example of chaos in a noniterative system, we investigate the following question: While
maximality of non-injectiveness produced by an increasing number of injective branches is necessary
for a family of functions to be chaotic, is this also sufficient for the system to be chaotic? This is an
[Figure 13, six panels. Left column, iterate 9000 of the logistic map: (a) order at λ = 3.569; (c) "edge of
chaos" at λ = λ∗; (e) chaos at λ = 3.57. Right column, 9000 iterations on the logistic map (cobweb plots):
(b) at λ = 3.569; (d) at λ∗ = 3.5699456; (f) at λ = 3.57.]

Fig. 13. Multifunctional and cobweb plots of λx(1 − x). Comparison of the graphs for the three values of λ shown in (a)–(f)
illustrates how the dramatic changes in the character of the former are conspicuously absent in the conventional plots that
display no perceptible distinction between the three cases.





important question especially in the context of a non-iterative family of functions where fixed points
are no longer relevant.
     Consider the sequence of functions |sin(πnx)|_{n=1}^∞. The graphs of the subsequence
|sin(2^{n−1}πx)| and of the sequence (t^n(x)) on [0, 1] are qualitatively similar in that they both
contain 2^{n−1} of their functional graphs each on a base of 1/2^{n−1}. Thus both
|sin(2^{n−1}πx)|_{n=1}^∞ and (t^n(x))_{n=1}^∞ converge graphically to the multifunction [0, 1] on the
same set of points equivalent to 0. This is sufficient for us to conclude that
|sin(2^{n−1}πx)|_{n=1}^∞, and hence |sin(πnx)|_{n=1}^∞, is chaotic on the infinite equivalent set [0].
While Fig. 9 was a comparison of the first four iterates of the tent and absolute sine maps, Fig. 14
shows the "converged" graphical limits after 17 iterations.

4.1. The chaotic attractor
One of the most fascinating characteristics of chaos in dynamical systems is the appearance of
attractors on which the dynamics is chaotic. For a subset A of a topological space (X, U) such that
R(f(A)) is contained in A, which ensures that the iteration process can be carried out in A (in this
section, unless otherwise stated to the contrary, f(A) will denote the graph and not the range (image)
of f), let

\[
f_{R_i}(A) = \bigcup_{j \ge i \in \mathbb{N}} f^{j}(A)
           = \bigcup_{j \ge i \in \mathbb{N}} \Big( \bigcup_{x \in A} f^{j}(x) \Big) \tag{57}
\]

generate the filter-base FB with A_i := f_{R_i}(A) ∈ FB being decreasingly nested, A_{i+1} ⊆ A_i for all
i ∈ N, in accordance with Definition A.1.1. The existence of a maximal chain with a corresponding
maximal element as assured by the Hausdorff Maximal Principle and Zorn's Lemma respectively implies a
nonempty core of FB. As in Sec. 3 following Definition 3.3, we now identify the filterbase with the
neighborhood base at f^∞ which allows us to define

\[
\mathrm{Atr}(A_1) \overset{\mathrm{def}}{=} \mathrm{adh}(\mathrm{FB})
                 = \bigcap_{A_i \in \mathrm{FB}} \mathrm{Cl}(A_i) \tag{58}
\]

as the attractor of the set A_1, where the last equality follows from Eqs. (59) and (20) and the closure
is with respect to the topology induced by the neighborhood filter base FB. Clearly the attractor as
defined here is the graphical limit of the sequence of functions (f^i)_{i∈N} which may be verified by
reference to Definition A.1.8, Theorem A.1.3 and the proofs of Theorems A.1.4 and A.1.5, together with
the directed set Eq. (A.10) with direction (A.11). The basin of attraction of the attractor is A_1
because the graphical limit (D+, F(D+)) ∪ (G(R+), R+) of Definition 3.1 may be obtained, as indicated
above, by a proper choice of sequences associated with A. Note that in the context of iterations of
functions, the graphical limit (D+, y0) of the sequence (f^n(x)) denotes a stable fixed point x∗ with
image x∗ = f(x∗) = y0 to which iterations starting at any point x ∈ D+ converge. The graphical limits
(x_{i0}, R+) are generated with respect to the class {x_{i∗}} of points satisfying f(x_{i0}) = x_{i∗},
i = 0, 1, 2, . . . equivalent to unstable fixed point x∗ := x_{0∗} to which inverse iterations starting
at any initial point in R+ must converge. Even though only x∗ is inverse stable, an equivalent class of
graphically converged limit multis is produced at every member of the class x_{i∗} ∈ [x∗], resulting in
the far-reaching consequence that every member of the class is as significant as the parent fixed point
x∗ from which they were born in determining the dynamics of the evolving system. The point to remember
about infinite intersections of a collection of sets having finite intersection property, as in
Eq. (58), is that this may very well be empty; recall, however, that in a compact space this is
guaranteed not to be so. In the general case, if core(A) ≠ ∅ then A is the principal filter at this
core, and Atr(A_1) by Eqs. (58) and (A.33) is the closure of this core, which in this case of topology
being induced by the filterbase, is just the core itself. A_1, by its very definition, is a positively
invariant set as any sequence of graphs converging to Atr(A_1) must be eventually in A_1: the entire
sequence therefore lies in A_1. Clearly, from Theorem A.3.1 and its corollary, the attractor is a
positively invariant compact set. A typical attractor is illustrated by the derived sets in the second
column of Fig. 22 which also illustrates that the set of functional relations is open in Multi(X);
specifically functional–non-functional correspondences are neutral-selfish related as in Fig. 22, 3–2,
with the attracting graphical limit of Eq. (58) forming the boundary of (finitely) many-to-one functions
and the one-to-(finitely) many multifunctions.
     Equation (58) is to be compared with the image definition of an attractor [Stuart & Humphries,
1996] where f(A) denotes the range and not the graph of f. Then Eq. (58) can be used to define a
[Figure 14, two panels plotted on 0 ≤ x ≤ .0008 with values in [0, 1]: (a) 17th iterate of tent map;
(b) graph of |sin(2^16 πx)|.]

Fig. 14. Similarity in the behavior of the graphs of (a) the tent map and (b) |sin(2^16 πx)| at 17 iterations demonstrates
chaoticity of the latter.
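
The qualitative similarity shown in Fig. 14 can be made concrete by comparing where the two graphs vanish and where they peak. The sketch below assumes the tent map t(x) = 2x for x < 1/2 and 2(1 − x) otherwise (peak value 1 at 0.5); the function names and the tolerance are illustrative assumptions. It checks that t^n and |sin(2^{n−1}πx)| both vanish at the points k/2^{n−1} and both reach 1 midway between them, so that each consists of 2^{n−1} humps on bases of width 1/2^{n−1}.

```python
import math


def tent(x):
    """Tent map on [0, 1] with peak value 1 at x = 0.5 (assumed form)."""
    return 2.0 * x if x < 0.5 else 2.0 * (1.0 - x)


def tent_iterate(x, n):
    """n-th iterate t^n(x) of the tent map."""
    for _ in range(n):
        x = tent(x)
    return x


def abs_sine(x, n):
    """The function |sin(2^(n-1) * pi * x)|."""
    return abs(math.sin(2 ** (n - 1) * math.pi * x))


def humps_agree(n, tol=1e-9):
    """True if both graphs have zeros at k/2^(n-1) and peaks of height 1 midway."""
    m = 2 ** (n - 1)
    zeros = [k / m for k in range(m + 1)]
    peaks = [(2 * k + 1) / (2 * m) for k in range(m)]
    zeros_ok = all(abs(tent_iterate(z, n)) < tol and abs_sine(z, n) < tol
                   for z in zeros)
    peaks_ok = all(abs(tent_iterate(p, n) - 1.0) < tol
                   and abs(abs_sine(p, n) - 1.0) < tol
                   for p in peaks)
    return zeros_ok and peaks_ok


print(all(humps_agree(n) for n in range(1, 9)))
```

The check says nothing about the shapes between zeros and peaks (linear for the tent map, sinusoidal for the sine family); it only confirms that the two families share the same set of zeros, which is what the graphical limit depends on.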



sequence of points x_k ∈ A_{n_k} and hence the subset

\[
\begin{aligned}
\omega(A) &\overset{\mathrm{def}}{=} \{x \in X : (\exists n_k \in \mathbb{N})(n_k \to \infty)(\exists x_k \in A_{n_k})(f^{n_k}(x_k) \to x)\} \\
          &= \{x \in X : (\forall N \in \mathcal{N}_x)(\forall A_i \in \mathcal{A})(N \cap A_i \neq \emptyset)\}
\end{aligned} \tag{59}
\]

as the corresponding attractor of A that satisfies an equation formally similar to (58) with the
difference that the filter-base A is now in terms of the image f(A) of A, which allows the adherence
expression to take the particularly simple form

\[
\omega(A) = \bigcap_{i \in \mathbb{N}} \mathrm{Cl}(f^{i}(A)). \tag{60}
\]

The complementary subset excluded from this definition of ω(A), as compared to Atr(A_1), that is
required to complete the formalism is given by Eq. (61) below. Observe that the equation for ω(A) is
essentially Eq. (A.15), even though we prefer to use the alternate form of Eq. (A.16) as this brings out
more clearly the frequenting nature of the sequence. The basin of attraction

\[
\begin{aligned}
B_f(A) &= \{x \in A : \omega(x) \subseteq \mathrm{Atr}(A)\} \\
       &= \{x \in A : (\exists n_k \in \mathbb{N})(n_k \to \infty)(f^{n_k}(x) \to x_* \in \omega(A))\}
\end{aligned} \tag{61}
\]

of the attractor is the smallest subset of X in which sequences generated by f must eventually lie in
order to adhere at ω(A). Comparison of Eqs. (62) with (33) and (61) with (32) shows that ω(A) can be
identified with the subset R+ on the y-axis on which the multifunctional limits G : R+ → X of graphical
convergence are generated, with its basin of attraction being contained in the D+ associated with the
injective branch of f that generates R+. In summary it may be concluded that since definitions (59) and
(61) involve both the domain and range of f, a description of the attractor in terms of the graph of f,
like that of Eq. (58), is more pertinent and meaningful as it combines the requirements of both these
equations. Thus, for example, as ω(A) is not the function G(R+), this attractor does not include the
equivalence class of inverse stable points that may be associated with x∗, see for example Fig. 15.
     From Eq. (59), we may make the particularly simple choice of (x_k) to satisfy f^{n_k}(x_{−k}) = x so
that x_{−k} = f_B^{−n_k}(x), where x_{−k} ∈ [x_{−k}] := f^{−n_k}(x) is the element of the equivalence
class of the inverse image of x corresponding to the injective branch f_B. This choice is of special
interest to us as it is the class that generates the G-function on R+ in graphical convergence. This
allows us to express ω(A) as

\[
\omega(A) = \{x \in X : (\exists n_k \in \mathbb{N})(n_k \to \infty)(f_B^{-n_k}(x) = x_{-k} \text{ converges in } (X, \mathcal{U}))\}; \tag{62}
\]

note that the x_{−k} of this equation and the x_k of Eq. (59) are, in general, quite different points.
     A simple illustrative example of the construction of ω(A) for the positive injective branch of the
homeomorphism (4x^2 − 1)/3, −1 ≤ x ≤ 1, is
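[Figure 15: graph of f = (4x^2 − 1)/3 on −1 ≤ x ≤ 1 together with the branch iterates f_B, f_B^2, f_B^3;
the points x, x_{−1}, x_{−3}, x_1, x_2 of the construction are marked on the right, the level −0.25 is
marked on the y-axis, and the iterates are numbered in italics.]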
Fig. 15. The attractor for f (x) = (4x2 − 1)/3, for −1 ≤ x ≤ 1. The converging sequences are denoted by arrows on the
right, and (xk ) are chosen according to the construction shown. This example demonstrates how although A ⊆ f (A), where
A = [0, 1] is the domain of the positive injective branch of f , the succeeding images (f i (A))i≥1 satisfy the required restriction
for iteration, and A in the discussion above can be taken to be f (A); this is permitted as only a finite number of iterates
is thereby discarded. It is straightforward to verify that Atr(A_1) = (−1, [−0.25, 1]) ∪ ((−1, 1), −0.25) ∪ (1, [−0.25, 1]) with
F (x) = −0.25 on D− = (−1, 1) = D+ and G(y) = 1, and −1 on R− = [−0.25, 1] = R+ . By comparison, ω(A) from either
its definition Eq. (59) or from the equivalent intersection expression Eq. (60), is simply the closed interval R + = [−0.25, 1].
The italicized iterate numbers on the graphs show how the oscillations die out with increasing iterations from x = ±1 and
approach −0.25 in all neighborhoods of 0.
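
The limiting behavior described in the caption of Fig. 15 can be checked directly. The sketch below is an illustration under stated assumptions, not the construction of the figure: the names `f`, `iterate`, `fixed_points`, `slopes` and `limits`, the starting points, and the iteration count are chosen here for illustration. The fixed points follow from 4x^2 − 3x − 1 = 0, and the derivative f'(x) = 8x/3 decides which of them attracts forward iterates.

```python
def f(x):
    """The map f(x) = (4x^2 - 1)/3 of Fig. 15 on [-1, 1]."""
    return (4.0 * x * x - 1.0) / 3.0


def iterate(x0, n):
    """Return the n-th forward iterate f^n(x0)."""
    x = x0
    for _ in range(n):
        x = f(x)
    return x


# Fixed points solve 4x^2 - 3x - 1 = 0, i.e. x = 1 and x = -1/4.
fixed_points = (1.0, -0.25)

# f'(x) = 8x/3: |f'| < 1 marks a stable fixed point, |f'| > 1 an unstable one.
slopes = {x: 8.0 * x / 3.0 for x in fixed_points}

# Forward iterates from several (arbitrarily chosen) points of (-1, 1).
limits = [iterate(x0, 200) for x0 in (-0.9, -0.3, 0.0, 0.5, 0.9)]

print(slopes)
print(limits)
```

With these choices the forward iterates settle on −0.25, whose slope has modulus 2/3, while x = 1 has slope 8/3 and is reached only from x = ±1. This matches the roles of −0.25 and ±1 in the expressions F(x) = −0.25 and G(y) = 1, −1 of the caption.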


shown in Fig. 15, where the arrow-heads denote the converging sequences f^{n_i}(x_i) → x and
f^{n_i−m}(x_i) → x_{−m} which proves invariance of ω(A) for a homeomorphic f; here continuity of the
function and its inverse is explicitly required for invariance. Positive invariance of a subset A of X
implies that for any n ∈ N and x ∈ A, f^n(x) = y_n ∈ A, while negative invariance assures that for any
y ∈ A, f^{−n}(y) = x_{−n} ∈ A. Invariance of A in both the forward and backward directions therefore
means that for any y ∈ A and n ∈ N, there exists an x ∈ A such that f^n(x) = y. In interpreting this
figure, it may be useful to recall from Definition 4.1 that an increasing number of injective branches
of f is a necessary, but not sufficient, condition for the occurrence of chaos; thus in Figs. 12(a) and
15, increasing noninjectivity of f leads to constant valued limit functions over a connected D+ in a
manner similar to that associated with the classical Gibbs phenomenon in the theory of Fourier series.
     Graphical convergence of an increasingly nonlinear family of functions implied by its increasing
non-injectivity may now be combined with the requirements of an attractor to lead to the concept of a
chaotic attractor to be that on which the dynamics is chaotic in the sense of Definitions 4.1 and 4.2.
Hence

Definition 4.3 (Chaotic Attractor). Let A be a positively invariant subset of X. The attractor Atr(A)
is chaotic on A if there is sensitive dependence on initial conditions for all x ∈ A. The sensitive
dependence manifests itself as multifunctional graphical limits for all x ∈ D+ and as chaotic orbits
when x ∉ D+.

     The picture of chaotic attractors that emerges from the foregoing discussions and our
characterization of chaos of Definition 4.1 is that it is a subset of X that is simultaneously "spiked"
multifunctional on the y-axis and consists of a dense collection of singleton domains of attraction on
the x-axis. This is illustrated in Fig. 16 which shows some typical chaotic attractors. The first four
diagrams (a)–(d) are for the logistic map with (b)–(d) showing the 4-, 2- and 1-piece attractors for
λ = 3.575,

3.66, and 3.8, respectively that are in qualitative agreement with the standard bifurcation diagram
reproduced in (e). Figures 16(b)–16(d) have the advantage of clearly demonstrating how the attractors
are formed by considering the graphically converged limit as the object of study unlike in (e) which
shows the values of the 501–1001th iterates of x0 = 1/2 as a function of λ. The difference in (a) and
(b) for a change of λ from λ = λ∗ = 3.5699456 to 3.575 is significant as λ = λ∗ marks the boundary
between the nonchaotic region for λ < λ∗ and the chaotic for λ > λ∗ (this is to be understood as being
suitably modified by the appearance of the nonchaotic windows for some specific intervals in λ > λ∗).
At λ∗ the generated fractal Cantor set Λ is an attractor as it attracts almost every initial point x0
so that the successive images x_n = f^n(x0) converge toward the Cantor set Λ. In (f) the chaotic
attractor for


[Figure 16, panels (a)–(d), graphical limits of the logistic map on the unit square: (a) λ = 3.5699456,
iterates 2001−2004; (b) λ = 3.575, iterates 2001−2004; (c) λ = 3.66, iterates 2001−2004; (d) λ = 3.8,
iterates 2001−2002.]
Fig. 16. Chaotic attractors for different values of λ. For the logistic map the usual bifurcation diagram (e) shows the chaotic
attractors for λ > λ∗ = 3.5699456, while (a)–(d) display the graphical limits for four values of λ chosen for the Cantor set and
the 4-, 2-, and 1-piece attractors, respectively. In (f) the attractor [0, 1] (where the dotted lines represent odd iterates and the
solid lines even iterates of f) disappears if f is reflected about the x-axis. The function f_f(x) is given by

\[
f_f(x) = \begin{cases} 2(1 + x)/3, & 0 \le x < 1/2 \\ 2(1 - x), & 1/2 \le x \le 1. \end{cases}
\]
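
As a small numerical companion to panel (f), the sketch below evaluates the piecewise function of the caption on a grid and compares it with its reflection about the x-axis. The grid, the variable names, and the reading of "reflected about the x-axis" as the map x → −f_f(x) are assumptions made for illustration only.

```python
def f_f(x):
    """Piecewise map of Fig. 16(f) on [0, 1]."""
    return 2.0 * (1.0 + x) / 3.0 if x < 0.5 else 2.0 * (1.0 - x)


# Sample [0, 1] on a uniform grid (illustrative resolution).
grid = [k / 1000 for k in range(1001)]

image = [f_f(x) for x in grid]        # values of f_f
reflected = [-f_f(x) for x in grid]   # assumed reading of the reflected map

# f_f maps [0, 1] into itself, so it can be iterated on [0, 1].
stays_inside = all(0.0 <= y <= 1.0 for y in image)

# The reflected map sends [0, 1] into [-1, 0], outside the domain of f_f
# except for the single value 0, so it cannot be iterated on [0, 1].
leaves_domain = all(y <= 0.0 for y in reflected)

print(stays_inside, leaves_domain)
```

On this reading the iteration that generates the attractor in (f) is no longer available for the reflected function, which is one way to understand the remark that the attractor disappears.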




[Figure 16 (continued), panels (e)–(f): (e) bifurcation diagram of the logistic map for 0 ≤ λ ≤ 4 with
λ4 = 3.449 and λ∗ marked on the λ-axis; (f) first 12 iterates of f_f on [0, 1].]

                                               Fig. 16.    (Continued)


the piecewise continuous function on [0, 1]                   on ordered sets, just as the role of the choice of
                                                              an appropriate problem-dependent basis was high-
                     2(1 + x) , 0 ≤ x  1
                    
                                                             lighted at the end of Sec. 2. Chaos as manifest in its
                          3                 2
                    
           ff (x) =                                           attractors is a direct consequence of the increasing
                    
                     2(1 − x),    1                          nonlinearity of the map with increasing iteration;
                                     ≤ x ≤ 1,
                    
                                   2                          we reemphasize that this is only a necessary condi-
is [0, 1] where the dotted lines represent odd iterates       tion so that the increasing nonlinearities of Figs. 12
and the full lines even iterates of f ; here the attrac-      and 15 eventually lead to stable states and not to
tor disappears if the function is reflected about the          chaotic instability. Under the right conditions as
x-axis.                                                       enunciated following Fig. 10, chaos appears to be
                                                              the natural outcome of the difference in the behav-
4.2. Why chaos? A preliminary                                 ior of a function f and its inverse f − under their
     inquiry                                                  successive applications. Thus f = f f − f allows f
The question as to why a natural system should                to take advantage of its multi-inverse to generate
evolve chaotically is both interesting and relevant,          all possible equivalence classes that are available, a
and this section attempts to advance a plausible an-          feature not accessible to f − = f − f f − . As we have
swer to this inquiry that is based on the connection          seen in the foregoing, equivalence classes of fixed
between topology and convergence contained in the             points, stable and unstable, are of defining signif-
Corollary to Theorem A.1.5. Open sets are group-              icance in determining the ultimate behavior of an
ings of elements that govern convergence of nets              evolving dynamical system and as the eventual (as
and filters, because the required property of being            also frequent) charcter of a filter or net in a set
either eventually or frequently in (open) neighbor-           is dictated by open neighborhoods of points of the
hoods of a point determines the eventual behavior             set, it is postulated that chaoticity on a set X leads
of the net; recall in this connection the unusual 1
                                             1      con-      to a reformulation of the open sets of X to equiv-
vergence characteristics in cofinite and cocountable           alence classes generated by the evolving map f , see
spaces. Conversely for a given convergence charac-            Example 2.4(3). Such a redefinition of open sets of
teristic of a class of nets, it is possible to infer the      equivalence classes allows the evolving system to tem-
topology of the space that is responsible for this            porally access an ever increasing number of states
convergence, and it is this point of view that we             even though the equivalent fixed points are not fixed
adopt here to investigate the question of this sub-           under iterations of f except for the parent of the
section: recall that our Definitions 4.1 and 4.2 were          class, and can be considered to be the governing
based on purely algebraic set-theoretic arguments             criterion for the cooperative or collective behavior

of the system. The predominance of the role of $f^-$ in $f = f f^- f$ in generating the equivalence classes (that is, exploiting the many-to-one character) of $f$ is reflected as limit multis for $f$ (i.e. constant $f^-$ on $\mathbb{R}_+$) in $f^- = f^- f f^-$; this interpretation of the dynamics of chaos is meaningful as graphical convergence leading to chaos is a result of pointwise biconvergence of the sequence of iterates of the functions generated by $f$. But as $f$ is a noninjective function on $X$ possessing the property of increasing nonlinearity in the form of increasing noninjectivity with iteration, various cycles of disjoint equivalence classes are generated under iteration, see for example Fig. 9(a) for the tent map. A reference to Fig. shows that the basic set $X_B$, for a finite number $n$ of iterations of $f$, contains the parent of each of these open equivalent sets in the domain of $f$, with the topology on $X_B$ being the corresponding $p$-images of these disjoint saturated open sets of the domain. In the limit of infinite iterations of $f$ leading to the multifunction $\mathcal{M}$ (this is the $f^\infty$ of Sec. 4.1), the generated open sets constitute a basis for a topology on $D(f)$ and the basis for the topology of $R(f)$ are the corresponding $\mathcal{M}$-images of these equivalent classes. It is our contention that the motive force behind evolution toward chaos, as defined by Definition 4.1, is the drive toward a state of the dynamical system that supports ininality of the limit multi $\mathcal{M}$; see Appendix A.2 with the discussions on Fig. and Eq. (26) in Sec. 2. In the limit of infinite iterations therefore, the open sets of the range $R(f) \subseteq X$ are the multi images that graphical convergence generates at each of these inverse-stable fixed points. $X$ therefore has two topologies imposed on it by the dynamics of $f$: the first of equivalence classes generated by the limit multi $\mathcal{M}$ in the domain of $f$ and the second as $\mathcal{M}$-images of these classes in the range of $f$. Quite clearly these two topologies need not be the same; their intersection therefore can be defined to be the chaotic topology on $X$ associated with the chaotic map $f$ on $X$. Neighborhoods of points in this topology cannot be arbitrarily small as they consist of all members of the equivalence class to which any element belongs; hence a sequence converging to any of these elements necessarily converges to all of them, and the eventual objective of chaotic dynamics is to generate a topology in $X$ with respect to which elements of the set can be grouped together in as large equivalence classes as possible in the sense that if a net converges simultaneously to points $x \neq y \in X$ then $x \sim y$: $x$ is of course equivalent to itself while $x, y, z$ are equivalent to each other iff they are simultaneously in every open set in which the net may eventually belong. This hallmark of chaos can be appreciated in terms of a necessary obliteration of any separation property that the space might have originally possessed, see property (H3) in Appendix A.3. We reemphasize that a set in this chaotic context is required to act in a dual capacity depending on whether it carries the initial or final topology under $\mathcal{M}$.

This preliminary inquiry into the nature of chaos is concluded in the final section of this paper.

5. Graphical Convergence Works

We present in this section some real evidence in support of our hypothesis of graphical convergence of functions in Multi(X, Y). The example is taken from neutron transport theory, and concerns the discretized spectral approximation [Sengupta, 1988, 1995] of Case's singular eigenfunction solution of the monoenergetic neutron transport equation [Case & Zweifel, 1967]. The neutron transport equation is a linear form of the Boltzmann equation that is obtained as follows. Consider the neutron-moderator system as a mixture of two species of gases each of which satisfies a Boltzmann equation of the type

\[
\left(\frac{\partial}{\partial t} + v_i \cdot \nabla\right) f_i(r, v, t)
= \int dv' \int dv_1 \int dv_1' \sum_j W_{ij}(v_i \to v'; v_1 \to v_1')
\left\{ f_i(r, v', t) f_j(r, v_1', t) - f_i(r, v, t) f_j(r, v_1, t) \right\}
\]

where

\[
W_{ij}(v_i \to v'; v_1 \to v_1') = |v - v_1| \, \sigma_{ij}(v - v', v_1 - v_1'),
\]

$\sigma_{ij}$ being the cross-section of interaction between species $i$ and $j$. Denote neutrons by subscript 1 and the background moderator with which the neutrons interact by 2, and make the assumptions that

(i) The neutron density $f_1$ is much less compared with that of the moderator $f_2$ so that the terms $f_1 f_1$ and $f_1 f_2$ may be neglected in the neutron and moderator equations, respectively.
(ii) The moderator distribution $f_2$ is not affected by the neutrons. This decouples the neutron and moderator equations and leads to an equilibrium

Maxwellian $f_M$ for the moderator while the neutrons are described by the linear equation

\[
\left(\frac{\partial}{\partial t} + v \cdot \nabla\right) f(r, v, t)
= \int dv' \int dv_1 \int dv_1' \, W_{12}(v \to v'; v_1 \to v_1')
\left\{ f(r, v', t) f_M(v_1') - f(r, v, t) f_M(v_1) \right\}.
\]

This is now put in the standard form of the neutron transport equation [Williams, 1971]

\[
\left(\frac{1}{v}\frac{\partial}{\partial t} + \hat{\Omega} \cdot \nabla + S(E)\right) \Phi(r, E, \hat{\Omega}, t)
= \int d\hat{\Omega}' \int dE' \, S(r, E' \to E; \hat{\Omega}' \cdot \hat{\Omega}) \, \Phi(r, E', \hat{\Omega}', t),
\]

where $E = mv^2/2$ is the energy and $\hat{\Omega}$ the direction of motion of the neutrons. The steady state, monoenergetic form of this equation is Eq. (A.53)

\[
\mu \frac{\partial \Phi(x, \mu)}{\partial x} + \Phi(x, \mu) = \frac{c}{2} \int_{-1}^{1} \Phi(x, \mu') \, d\mu',
\qquad 0 < c < 1, \quad -1 \le \mu \le 1
\]

and its singular eigenfunction solution for $x \in (-\infty, \infty)$ is given by Eq. (A.56)

\[
\Phi(x, \mu) = a(\nu_0) e^{-x/\nu_0} \phi(\mu, \nu_0) + a(-\nu_0) e^{x/\nu_0} \phi(-\nu_0, \mu)
+ \int_{-1}^{1} a(\nu) e^{-x/\nu} \phi(\mu, \nu) \, d\nu ;
\]

see Appendix A.4 for an introductory review of Case's solution of the one-speed neutron transport equation.

The term "eigenfunction" is motivated by the following considerations. Consider the eigenvalue equation

\[
(\mu - \nu) F_\nu(\mu) = 0, \qquad \mu \in V(\mu), \quad \nu \in \mathbb{R} \tag{63}
\]

in the space of multifunctions Multi(V(µ), (−∞, ∞)), where µ is in either of the intervals [−1, 1] or [0, 1] depending on whether the given boundary conditions for Eq. (A.53) are full-range or half-range. If we are looking only for functional solutions of Eq. (63), then the unique function F that satisfies this equation for all possible µ ∈ V(µ) and ν ∈ R − V(µ) is $F_\nu(\mu) = 0$ which means, according to Table 1, that the point spectrum of µ is empty and $(\mu - \nu)^{-1}$ exists for all ν. When ν ∈ V(µ), however, this inverse is not continuous and we show below that in Map(V(µ), 0), ν ∈ V(µ) belongs to the continuous spectrum of µ. This distinction between the nature of the inverses depending on the relative values of µ and ν suggests a wider "non-function" space in which to look for the solutions of operator equations, and in keeping with the philosophy embodied in Fig. of treating inverse problems in the space of multifunctions, we consider all $F_\nu \in$ Multi(V(µ), R) satisfying Eq. (63) to be eigenfunctions of µ for the corresponding eigenvalue ν, leading to the following multifunctional solution of (63)

\[
F_\nu(\mu) =
\begin{cases}
(V(\mu), 0) & \text{if } \nu \notin V(\mu) \\
(V(\mu) - \nu, 0) \cup (\nu, \mathbb{R}) & \text{if } \nu \in V(\mu),
\end{cases}
\]

where V(µ) − ν is used as a shorthand for the interval V(µ) with ν deleted. Rewriting the eigenvalue equation (63) as $\mu_\nu(F_\nu(\mu)) = 0$ and comparing this with Fig. , allows us to draw the correspondences

\[
\begin{aligned}
f &\Leftrightarrow \mu_\nu \\
X \text{ and } Y &\Leftrightarrow \{F_\nu \in \mathrm{Multi}(V(\mu), \mathbb{R}) : F_\nu \in D(\mu_\nu)\} \\
f(X) &\Leftrightarrow \{0 : 0 \in Y\} \\
X_B &\Leftrightarrow \{0 : 0 \in X\} \\
f^- &\Leftrightarrow \mu_\nu^- .
\end{aligned}
\tag{64}
\]

Thus a multifunction in X is equivalent to 0 in $X_B$ under the linear map $\mu_\nu$, and we show below that this multifunction is in fact the Dirac delta "function" $\delta_\nu(\mu)$, usually written as δ(µ − ν). This suggests that in Multi(V(µ), R), every ν ∈ V(µ) is in the point spectrum of µ, so that discontinuous functions that are pointwise limits of functions in function space can be replaced by graphically converged multifunctions in the space of multifunctions. Completing the equivalence class of 0 in Fig. , gives the multifunctional solution of Eq. (63).

From a comparison of the definition of ill-posedness (Sec. 2) and the spectrum (Table 1), it is clear that $L_\lambda(x) = y$ is ill-posed iff

(1) $L_\lambda$ not injective ⇔ $\lambda \in P\sigma(L_\lambda)$, which corresponds to the first row of Table 1.
(2) $L_\lambda$ not surjective ⇔ the values of λ correspond to the second and third columns of Table 1.
(3) $L_\lambda$ is bijective but not open ⇔ λ is either in $C\sigma(L_\lambda)$ or $R\sigma(L_\lambda)$ corresponding to the second row of Table 1.

We verify in the three steps below that, for $X = L_1[-1, 1]$ the space of integrable functions, ν ∈ V(µ) = [−1, 1] belongs to the continuous spectrum of µ.

Table 1. Spectrum of linear operator L ∈ Map(X). Here $L_\lambda := L - \lambda$ satisfies the equation $L_\lambda(x) = 0$, with the resolvent set ρ(L) of L consisting of all those complex numbers λ for which $L_\lambda^{-1}$ exists as a continuous operator with dense domain. Any value of λ for which this is not true is in the spectrum σ(L) of L, that is further subdivided into three disjoint components of the point, continuous and residual spectra according to the criteria shown in the table.

                                                      R(L_λ)
  L_λ              L_λ^{-1}           R = X        Cl(R) = X      Cl(R) ≠ X

  Not injective    ···                Pσ(L)        Pσ(L)          Pσ(L)
  Injective        Not continuous     Cσ(L)        Cσ(L)          Rσ(L)
  Injective        Continuous         ρ(L)         ρ(L)           Rσ(L)
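The classification in Table 1 amounts to a three-way lookup, which the following minimal Python sketch spells out. The function name `classify_spectrum` and the string labels for the three range columns are illustrative assumptions, not notation from the paper; the sketch only encodes the table entries as printed.

```python
def classify_spectrum(injective, inverse_continuous, range_kind):
    """Return the Table 1 entry for L_lambda.

    range_kind is a hypothetical label for the three columns of the table:
      "all"       : R(L_lambda) = X
      "dense"     : Cl(R) = X (range dense)
      "not_dense" : Cl(R) != X
    """
    if range_kind not in ("all", "dense", "not_dense"):
        raise ValueError("range_kind must be 'all', 'dense' or 'not_dense'")
    if not injective:
        # First row: the inverse does not exist, lambda is an eigenvalue.
        return "point"
    if range_kind == "not_dense":
        # Third column of the injective rows: residual spectrum.
        return "residual"
    # Injective with dense range: continuity of the inverse decides.
    return "resolvent" if inverse_continuous else "continuous"


if __name__ == "__main__":
    # State 2-2 of Table 1: injective, inverse not continuous, dense range.
    print(classify_spectrum(True, False, "dense"))
```

The call in the `__main__` block corresponds to the row-2, column-2 entry of the table, which is the case the text goes on to establish for the multiplication operator µ.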




(a) $R(\mu_\nu)$ is dense, but not equal to $L_1$. The set of functions $g(\mu) \in L_1$ such that $\mu_\nu^{-1} g \in L_1$ cannot be the whole of $L_1$. Thus, for example, the piecewise constant function $g = \text{const} \neq 0$ on $|\mu - \nu| \le \delta$, $\delta > 0$, and 0 otherwise is in $L_1$ but not in $R(\mu_\nu)$ as $\mu_\nu^{-1} g \notin L_1$. Nevertheless for any $g \in L_1$, we may choose the sequence of functions

\[
g_n(\mu) =
\begin{cases}
0, & \text{if } |\mu - \nu| \le 1/n \\
g(\mu), & \text{otherwise}
\end{cases}
\]

in $R(\mu_\nu)$ to be eventually in every neighborhood of $g$ in the sense that $\lim_{n \to \infty} \int_{-1}^{1} |g - g_n| = 0$.

(b) The inverse $(\mu - \nu)^{-1}$ exists but is not continuous. The inverse exists because, as noted earlier, 0 is the only functional solution of Eq. (63). Nevertheless, although the net of functions

\[
\delta_{\nu\varepsilon}(\mu) = \frac{1}{\tan^{-1}\big((1 + \nu)/\varepsilon\big) + \tan^{-1}\big((1 - \nu)/\varepsilon\big)}
\times \frac{\varepsilon}{(\mu - \nu)^2 + \varepsilon^2}, \qquad \varepsilon > 0
\]

is in the domain of $\mu_\nu$ because $\int_{-1}^{1} \delta_{\nu\varepsilon}(\mu) \, d\mu = 1$ for all $\varepsilon > 0$,

\[
\lim_{\varepsilon \to 0} \int_{-1}^{1} |\mu - \nu| \, \delta_{\nu\varepsilon}(\mu) \, d\mu = 0
\]

implying that $(\mu - \nu)^{-1}$ is unbounded.

Taken together, (a) and (b) show that functional solutions of Eq. (63) lead to state 2–2 in Table 1; hence ν ∈ [−1, 1] = Cσ(µ).

(c) The two integral constraints in (b) also mean that ν ∈ Cσ(µ) is a generalized eigenvalue of µ which justifies calling the graphical limit $\delta_{\nu\varepsilon}(\mu) \xrightarrow{G} \delta_\nu(\mu)$ a generalized, or singular, eigenfunction, see Fig. 17 which clearly indicates the convergence of the net of functions.^26

Fig. 17. Graphical convergence of: (a) Poisson kernel $\delta_\varepsilon(x) = \varepsilon/[\pi(x^2 + \varepsilon^2)]$ and (b) conjugate Poisson kernel $P_\varepsilon(x) = x/(x^2 + \varepsilon^2)$ to the Dirac delta and principal value, respectively; the graphs, each for a definite ε-value, converge to the respective limits as ε → 0.

From the fact that the solution Eq. (A.56) of the transport equation contains an integral involving the multifunction φ(µ, ν), we may draw an interesting physical interpretation. As the multi appears everywhere on V(µ) (i.e. there are no chaotic orbits but only the multifunctions that produce them), we have here a situation typical of maximal ill-posedness characteristic of chaos: note that both the functions comprising $\phi_\varepsilon(\mu, \nu)$ are non-injective. As the solution (A.56) involves an integral over all ν ∈ V(µ), the singular eigenfunctions (that collectively may be regarded as representing a chaotic substate of the system represented by the solution of the neutron transport equation) combine with the functional components $\phi(\pm\nu_0, \mu)$ to produce the well-defined, non-chaotic, experimental end result of the neutron flux Φ(x, µ).

The solution (A.56) is obtained by assuming $\Phi(x, \mu) = e^{-x/\nu} \phi(\mu, \nu)$ to get the equation for φ(µ, ν) to be $(\mu - \nu)\phi(\mu, \nu) = -c\nu/2$ with the normalization $\int_{-1}^{1} \phi(\mu, \nu) = 1$. As $\mu_\nu^{-1}$ is not invertible in Multi(V(µ), R) and $\mu_{\nu B} : X_B \to f(X)$ does not exist, the alternate approach of regularization was adopted in [Sengupta, 1988, 1995] to rewrite $\mu_\nu \phi(\mu, \nu) = -c\nu/2$ as $\mu_{\nu\varepsilon} \phi_\varepsilon(\mu, \nu) = -c\nu/2$ with $\mu_{\nu\varepsilon} := \mu - (\nu + i\varepsilon)$ being a net of bijective functions for $\varepsilon > 0$; this is a consequence of the fact that for the multiplication operator every non-real λ belongs to the resolvent set of the operator. The family of solutions of the latter equation is given by [Sengupta, 1988, 1995]

\[
\phi_\varepsilon(\nu, \mu) = \frac{c\nu}{2} \frac{\nu - \mu}{(\mu - \nu)^2 + \varepsilon^2}
+ \frac{\lambda_\varepsilon(\nu)}{\pi_\varepsilon} \frac{\varepsilon}{(\mu - \nu)^2 + \varepsilon^2} \tag{65}
\]

where the required normalization $\int_{-1}^{1} \phi_\varepsilon(\nu, \mu) = 1$ gives

\[
\lambda_\varepsilon(\nu) = \frac{\pi_\varepsilon}{\tan^{-1}\big((1 + \nu)/\varepsilon\big) + \tan^{-1}\big((1 - \nu)/\varepsilon\big)}
\times \left(1 - \frac{c\nu}{4} \ln \frac{(1 + \nu)^2 + \varepsilon^2}{(1 - \nu)^2 + \varepsilon^2}\right)
\xrightarrow{\varepsilon \to 0} \pi\lambda(\nu)
\]

with

\[
\pi_\varepsilon = \varepsilon \int_{-1}^{1} \frac{d\mu}{\mu^2 + \varepsilon^2} = 2 \tan^{-1}\left(\frac{1}{\varepsilon}\right) \xrightarrow{\varepsilon \to 0} \pi .
\]

These discretized equations should be compared with the corresponding exact ones of Appendix A.4. We shall see that the net of functions (65) converges graphically to the multifunction Eq. (A.55) as ε → 0.

In the discretized spectral approximation, the singular eigenfunction φ(µ, ν) is replaced by $\phi_\varepsilon(\mu, \nu)$, ε → 0, with the integral in ν being replaced by an appropriate sum. The solution Eq. (A.58) of the physically interesting half-space x ≥ 0 problem then reduces to [Sengupta, 1988, 1995]

\[
\Phi_\varepsilon(x, \mu) = a(\nu_0) e^{-x/\nu_0} \phi(\mu, \nu_0) + \sum_{i=1}^{N} a(\nu_i) e^{-x/\nu_i} \phi_\varepsilon(\mu, \nu_i), \qquad \mu \in [0, 1] \tag{66}
\]

where the nodes $\{\nu_i\}_{i=1}^{N}$ are chosen suitably. This discretized spectral approximation to Case's solution has given surprisingly accurate numerical results for a set of properly chosen nodes when compared with exact calculations. Because of its involved nature [Case & Zweifel, 1967], the exact calculations are basically numerical, which leads to nonlinear integral equations as part of the solution procedure. To appreciate the enormous complexity of the exact treatment of the half-space problem, we recall that the complete set of eigenfunctions $\{\phi(\mu, \nu_0), \{\phi(\mu, \nu)\}_{\nu \in [0, 1]}\}$ is orthogonal with respect to the half-range weight function W(µ) of half-range theory, Eq. (A.61), that is expressed only in terms of the solution of the nonlinear integral equation Eq. (A.62). The solution of a half-space problem then evaluates the coefficients $\{a(\nu_0), a(\nu)_{\nu \in [0, 1]}\}$ from the appropriate half-range (that is 0 ≤ µ ≤ 1) orthogonality integrals satisfied by the eigenfunctions $\{\phi(\mu, \nu_0), \{\phi(\mu, \nu)\}_{\nu \in [0, 1]}\}$ with respect to the weight W(µ), see Appendix A.4 for the necessary details of the half-space problem in neutron transport theory.

As may be appreciated from this brief introduction, solutions to half-space problems are not simple and actual numerical computations must rely a great deal on tabulated values of the X-function.

26 The technical definition of a generalized eigenvalue is as follows. Let L be a linear operator such that there exists in the domain of L a sequence of elements $(x_n)$ with $\|x_n\| = 1$ for all n. If $\lim_{n \to \infty} \|(L - \lambda)x_n\| = 0$ for some λ ∈ C, then this λ is a generalized eigenvalue of L, the corresponding eigenfunction $x_\infty$ being a generalized eigenfunction.
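The two integral constraints of step (b) and the normalization imposed on the regularized eigenfunction of Eq. (65) can be checked numerically. The sketch below is only an illustration under stated assumptions: the function names, the composite Simpson quadrature, and the sample values ν = 0.3, c = 0.6 and ε ∈ {0.1, 0.01} are all choices made here and are not taken from [Sengupta, 1988, 1995].

```python
import math


def atan_sum(nu, eps):
    """tan^-1((1+nu)/eps) + tan^-1((1-nu)/eps): the normalizing factor in the text."""
    return math.atan((1.0 + nu) / eps) + math.atan((1.0 - nu) / eps)


def delta_eps(mu, nu, eps):
    """The net delta_{nu,eps}(mu) of step (b)."""
    return eps / (atan_sum(nu, eps) * ((mu - nu) ** 2 + eps ** 2))


def pi_eps(eps):
    """pi_eps = 2 tan^-1(1/eps), which tends to pi as eps -> 0."""
    return 2.0 * math.atan(1.0 / eps)


def lambda_eps(nu, eps, c):
    """lambda_eps(nu) as fixed by the normalization of Eq. (65)."""
    log_term = math.log(((1.0 + nu) ** 2 + eps ** 2) / ((1.0 - nu) ** 2 + eps ** 2))
    return pi_eps(eps) / atan_sum(nu, eps) * (1.0 - 0.25 * c * nu * log_term)


def phi_eps(mu, nu, eps, c):
    """Regularized eigenfunction of Eq. (65)."""
    d = (mu - nu) ** 2 + eps ** 2
    return 0.5 * c * nu * (nu - mu) / d + (lambda_eps(nu, eps, c) / pi_eps(eps)) * eps / d


def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b]; n must be even (an assumed quadrature)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * f(a + k * h)
    return total * h / 3.0


def first_moment(nu, eps):
    """Integral of |mu - nu| * delta_{nu,eps}(mu) over [-1, 1]."""
    return simpson(lambda mu: abs(mu - nu) * delta_eps(mu, nu, eps), -1.0, 1.0)


if __name__ == "__main__":
    nu, c = 0.3, 0.6  # illustrative values only
    for eps in (0.1, 0.01):
        norm_delta = simpson(lambda mu: delta_eps(mu, nu, eps), -1.0, 1.0)
        norm_phi = simpson(lambda mu: phi_eps(mu, nu, eps, c), -1.0, 1.0)
        print(eps, norm_delta, first_moment(nu, eps), norm_phi)
```

Both normalizations should come out as 1 up to quadrature error for every ε, while the first moment shrinks as ε decreases, in line with the limit stated in step (b). The fixed grid is adequate only while ε stays well above the step size; much smaller ε would need a finer or adaptive rule.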

Self-consistent calculations of sample benchmark problems performed by the discretized spectral approximation in a full-range adaption of the half-range problem described below that generate all necessary data, independent of numerical tables, with the quadrature nodes $\{\nu_i\}_{i=1}^{N}$ taken at the zeros of Legendre polynomials, show that the full-range formulation of this approximation [Sengupta, 1988, 1995] can give very accurate results not only of integrated quantities like the flux Φ and leakage of particles out of the half space, but also of basic "raw" data like the extrapolated end point

\[
z_0 = \frac{c\nu_0}{4} \int_0^1 \frac{\nu}{N(\nu)} \left(1 + \frac{c\nu^2}{1 - \nu^2}\right) \ln\left(\frac{\nu_0 + \nu}{\nu_0 - \nu}\right) d\nu \tag{67}
\]

and of the X-function itself. Given the involved nature of the exact theory, it is our contention that the remarkable accuracy of these basic data, some of which are reproduced in Table 2, is due to the graphical convergence of the net of functions

\[
\phi_\varepsilon(\mu, \nu) \xrightarrow{G} \phi(\mu, \nu)
\]

shown in Fig. 18; here ε = 1/πN so that ε → 0 as N → ∞. By this convergence, the delta function and principal values in [−1, 1] are the multifunctions $([-1, 0), 0) \cup (0, [0, \infty)) \cup ((0, 1], 0)$ and $\{1/x\}_{x \in [-1, 0)} \cup (0, (-\infty, \infty)) \cup \{1/x\}_{x \in (0, 1]}$ respectively. Tables 2 and 3, taken from [Sengupta, 1988] and [Sengupta, 1995], show respectively the extrapolated end point and X-function by the full-range adaption of the discretized spectral approximation for two different half-range problems denoted as Problems A and B defined as

Problem A
  Equation: $\mu\Phi_x + \Phi = (c/2) \int_{-1}^{1} \Phi(x, \mu') \, d\mu'$, x ≥ 0
  Boundary condition: Φ(0, µ) = 0 for µ ≥ 0
  Asymptotic condition: $\Phi \to e^{-x/\nu_0} \phi(\mu, \nu_0)$ as x → ∞.

Problem B
  Equation: $\mu\Phi_x + \Phi = (c/2) \int_{-1}^{1} \Phi(x, \mu') \, d\mu'$, x ≥ 0
  Boundary condition: Φ(0, µ) = 1 for µ ≥ 0
  Asymptotic condition: Φ → 0 as x → ∞.

Fig. 18. Rational function approximations $\phi_\varepsilon(\mu, \nu)$ of the singular eigenfunction φ(µ, ν) at four different values of ν. N = 1000 denotes the "converged" multifunction φ, with the peaks at the specific ν-values chosen. Panels: (a) c = 0.3; (b) c = 0.9.
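The quadrature nodes mentioned above, taken at the zeros of Legendre polynomials, can be generated with a few lines of Python. This is a minimal sketch under an assumption made here: the zeros on [−1, 1] are mapped affinely to [0, 1] via ν = (x + 1)/2, a shift inferred from the $\nu_i$ column of Table 3 rather than stated in the text. The function names and the Newton iteration are likewise illustrative choices.

```python
import math


def legendre_and_derivative(n, x):
    """Return (P_n(x), P_n'(x)) via the three-term recurrence, for |x| < 1."""
    p_prev, p = 1.0, x
    for k in range(2, n + 1):
        p_prev, p = p, ((2 * k - 1) * x * p - (k - 1) * p_prev) / k
    dp = n * (x * p - p_prev) / (x * x - 1.0)
    return p, dp


def legendre_zeros(n):
    """Zeros of P_n on (-1, 1), found by Newton iteration, in increasing order."""
    zeros = []
    for i in range(1, n + 1):
        # Standard asymptotic starting guess for the i-th zero.
        x = math.cos(math.pi * (i - 0.25) / (n + 0.5))
        for _ in range(100):
            p, dp = legendre_and_derivative(n, x)
            step = p / dp
            x -= step
            if abs(step) < 1e-15:
                break
        zeros.append(x)
    return sorted(zeros)


def shifted_nodes(n):
    """Assumed mapping of the zeros from [-1, 1] to [0, 1]: nu = (x + 1) / 2."""
    return [0.5 * (x + 1.0) for x in legendre_zeros(n)]


if __name__ == "__main__":
    for n in (2, 6, 10):
        print(n, [round(nu, 4) for nu in shifted_nodes(n)])
```

For N = 6 and N = 10 the shifted zeros agree with the $\nu_i$ values listed in Table 3 to within about one unit in the fourth decimal, which is what motivates the assumed mapping.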

Table 2. Extrapolated end-point $z_0$.

                                 c z_0
    c        N = 2        N = 6        N = 10       Exact

   0.2      0.78478      0.78478      0.78478      0.7851
   0.4      0.72996      0.72996      0.72996      0.7305
   0.6      0.71535      0.71536      0.71536      0.7155
   0.8      0.71124      0.71124      0.71124      0.7113
   0.9      0.71060      0.71060      0.71061      0.7106

Table 3. X(−ν) by the full-range method.

                                       X(−ν)
    c      N      ν_i        Problem A      Problem B      Exact

   0.2     2     0.2133      0.8873091      0.8873091      0.887308
                 0.7887      0.5826001      0.5826001      0.582500

   0.6     6     0.0338      1.3370163      1.3370163      1.337015
                 0.1694      1.0999831      1.0999831      1.099983
                 0.3807      0.8792321      0.8792321      0.879232
                 0.6193      0.7215240      0.7215240      0.721524
                 0.8306      0.6239109      0.6239109      0.623911
                 0.9662      0.5743556      0.5743556      0.574355

   0.9    10     0.0130      1.5971784      1.5971784      1.597163
                 0.0674      1.4245314      1.4245314      1.424532
                 0.1603      1.2289940      1.2289940      1.228995
                 0.2833      1.0513750      1.0513750      1.051376
                 0.4255      0.9058140      0.9058410      0.905842
                 0.5744      0.7934295      0.7934295      0.793430
                 0.7167      0.7102823      0.7102823      0.710283
                 0.8397      0.6516836      0.6516836      0.651683
                 0.9325      0.6136514      0.6136514      0.613653
                 0.9870      0.5933988      0.5933988      0.593399

The full −1 ≤ µ ≤ 1 range form of the half 0 ≤ µ ≤ 1 range discretized spectral approximation replaces the exact integral boundary condition at x = 0 by a suitable quadrature sum over the values of ν taken at the zeros of Legendre polynomials; thus the condition at x = 0 can be expressed as

\[
\psi(\mu) = a(\nu_0) \phi(\mu, \nu_0) + \sum_{i=1}^{N} a(\nu_i) \phi_\varepsilon(\mu, \nu_i), \qquad \mu \in [0, 1], \tag{68}
\]

where ψ(µ) = Φ(0, µ) is the specified incoming radiation incident on the boundary from the left, and the half-range coefficients $a(\nu_0)$, $\{a(\nu)\}_{\nu \in [0, 1]}$ are to be evaluated using the W-function of Appendix A.4. We now exploit the relative simplicity of the full-range calculations by replacing Eq. (68) by Eq. (69) following, where the coefficients $\{b(\nu_i)\}_{i=0}^{N}$ are used to distinguish the full-range coefficients from the half-range ones. The significance of this change lies in the overwhelming simplicity of the full-range weight function µ as compared to the half-range function W(µ), and the resulting simplicity of the orthogonality relations that follow, see Appendix A.4. The basic data of $z_0$ and X(−ν) are then completely generated self-consistently [Sengupta, 1988, 1995] by the discretized spectral approximation from the full-range adaption

\[
\sum_{i=0}^{N} b_i \phi_\varepsilon(\mu, \nu_i) = \psi_+(\mu) + \psi_-(\mu), \qquad \mu \in [-1, 1], \quad \nu_i \ge 0 \tag{69}
\]

of the discretized boundary condition Eq. (68), where ψ_+(µ) is by definition the incident flux ψ(µ) for µ ∈ [0, 1] and 0 if µ ∈ [−1, 0], while

\[
\psi_-(\mu) =
\begin{cases}
\displaystyle\sum_{i=0}^{N} b_i^- \phi_\varepsilon(\mu, \nu_i) & \text{if } \mu \in [-1, 0],\ \nu_i \ge 0 \\[1ex]
0 & \text{if } \mu \in [0, 1]
\end{cases}
\]

is the emergent angular distribution out of the medium. Equation (69) corresponds to the full-range µ ∈ [−1, 1], ν_i ≥ 0 form

\[
b(\nu_0)\phi(\mu, \nu_0) + \int_0^1 b(\nu)\phi(\mu, \nu)\,d\nu
= \psi_+(\mu) + b^-(\nu_0)\phi(\mu, \nu_0) + \int_0^1 b^-(\nu)\phi(\mu, \nu)\,d\nu
\tag{70}
\]

of boundary condition (A.59) with the first and second terms on the right having the same interpretation as for Eq. (69). This full-range simulation merely states that the solution (A.58) of Eq. (A.53) holds for all µ ∈ [−1, 1], x ≥ 0, although it was obtained, unlike in the regular full-range case, from the given radiation ψ(µ) incident on the boundary at x = 0 over only half the interval µ ∈ [0, 1]. To obtain the simulated full-range coefficients {b_i} and {b_i^−} of the half-range problem, we observe that there are effectively only half the number of coefficients as compared to a normal full-range problem because ν is now only over half the full interval. This allows us to generate two sets of equations from (70) by integrating with respect to µ ∈ [−1, 1] with ν in the half intervals [−1, 0] and [0, 1] to obtain the two sets of coefficients b^− and b, respectively. Accordingly we get from Eq. (69) with j = 0, 1, . . . , N the sets of equations

\[
\begin{aligned}
(\psi, \phi_{j-})_\mu^{(+)} &= -\sum_{i=0}^{N} b_i^- (\phi_{i+}, \phi_{j-})_\mu^{(-)} \\
b_j &= \frac{1}{N_j}\left[ (\psi, \phi_{j+})_\mu^{(+)} + \sum_{i=0}^{N} b_i^- (\phi_{i+}, \phi_{j+})_\mu^{(-)} \right]
\end{aligned}
\tag{71}
\]

where (φ_{j±})_{j=1}^{N} represents (φ_ε(µ, ±ν_j))_{j=1}^{N}, φ_{0±} = φ(µ, ±ν_0), the (+) and (−) superscripts are used to denote the integrations with respect to µ ∈ [0, 1] and µ ∈ [−1, 0] respectively, and (f, g)_µ denotes the usual inner product in [−1, 1] with respect to the full range weight µ. While the first set of N + 1 equations gives b_i^−, the second set produces the required b_j from these “negative” coefficients. By equating these calculated b_i with the exact half-range expressions for a(ν) with respect to W(µ) as outlined in Appendix A.4, it is possible to find numerical values of z_0 and X(−ν). Thus from the second of Eq. (A.64), {X(−ν_i)}_{i=1}^{N} is obtained with b_{iB} = a_{iB}, i = 1, . . . , N, which is then substituted in the second of Eq. (A.63) with X(−ν_0) obtained from a_A(ν_0) according to Appendix A.4, to compare the respective a_{iA} with the calculated b_{iA} from (71). Finally the full-range coefficients of Problem A can be used to obtain the X(−ν) values from the second of Eqs. (A.63) and compared with the exact tabulated values as in Table 3. The tabulated values of cz_0 from Eq. (67) show a consistent deviation from our calculations of Problem A according to a_A(ν_0) = −exp(−2z_0/ν_0). Since the X(−ν) values of Problem A in Table 3 also need the same b_{0A} as input that was used in obtaining z_0, it is reasonable to conclude that the “exact” numerical integration of z_0 is inaccurate to the extent displayed in Table 2.

     From these numerical experiments and Fig. 18 we may conclude that the continuous spectrum [−1, 1] of the position operator µ acts as the D_+ points in generating the multifunctional Case singular eigenfunction φ(µ, ν). Its rational approximation φ_ε(µ, ν), in the context of the simple simulated full-range computations of the complex half-range exact theory of Appendix A.4, clearly demonstrates the utility of graphical convergence of a sequence of functions to a multifunction. The totality of the multifunctions φ(µ, ν) for all ν in Figs. 18(c) and 18(d) endows the problem with the character of maximal ill-posedness that is characteristic of chaos. This chaotic signature of the transport equation is however latent, as the experimental output Φ(x, µ) is well-behaved and regular. This important example shows how nature can use hidden and complex chaotic substates to generate order through a process of superposition.

6. Does Nature Support Complexity?

The question of this section is basic in the light of the theory of chaos presented above as it may be reformulated to the inquiry of what makes nature support chaoticity in the form of increasing noninjectivity of an input–output system. It is the purpose of this section to exploit the connection between spectral theory and the dynamics of chaos that has been presented in the previous section.

Since linear operators on finite dimensional spaces do not possess continuous or residual spectra, spectral theory on infinite dimensional spaces essentially involves limiting behavior to infinite dimensions of the familiar matrix eigenvalue–eigenvector problem. As always this means extensions, dense embeddings and completions of the finite dimensional problem that show up as generalized eigenvalues and eigenvectors. In its usual form, the goal of nonlinear spectral theory consists [Appell et al., 2000] in the study of T_λ^{−1} for nonlinear operators T_λ that satisfy more general continuity conditions, like differentiability and Lipschitz continuity, than simple boundedness that is sufficient for linear operators. The following generalization of the concept of the spectrum of a linear operator to the nonlinear case is suggestive. For a nonlinear map, λ need not appear only in a multiplying role, so that an eigenvalue equation can be written more generally as a fixed-point equation

\[
f(\lambda; x) = x
\]

with a fixed point corresponding to the eigenfunction of a linear operator and an “eigenvalue” being the value of λ for which this fixed point appears. The correspondence of the residual and continuous parts of the spectrum is, however, less trivial than for the point spectrum. This is seen from the following two examples [Roman, 1975]. Let Ae_k = λ_k e_k, k = 1, 2, . . . be an eigenvalue equation with e_j being the jth unit vector. Then (A − λ)e_k := (λ_k − λ)e_k = 0 iff λ = λ_k so that {λ_k}_{k=1}^{∞} ∈ Pσ(A) are the only eigenvalues of A. Consider now (λ_k)_{k=1}^{∞} to be a sequence of real numbers that tends to a finite λ_*; for example, let A be a diagonal matrix having 1/k as its diagonal entries. Then λ_* belongs to the continuous spectrum of A because (A − λ_*)e_k = (λ_k − λ_*)e_k with λ_k → λ_* implies that (A − λ_*)^{−1} is an unbounded linear operator and λ_* a generalized eigenvalue of A. In the second example Ae_k = e_{k+1}/(k + 1), it is not difficult to verify that: (a) The point spectrum of A is empty, (b) The range of A is not dense because it does not contain e_1, and (c) A^{−1} is unbounded because Ae_k → 0. Thus the generalized eigenvalue λ_* = 0 in this case belongs to the residual spectrum of A. In either case, lim_{j→∞} e_j is the corresponding generalized eigenvector that enlarges the trivial null space N(L_{λ_*}) of the generalized eigenvalue λ_*. In fact in these two and the Dirac delta example of Sec. 5 of continuous and residual spectra, the generalized eigenfunctions arise as the limits of a sequence of functions whose images under the respective L_λ converge to 0; recall the definition of footnote 26. This observation generalizes to the dense extension Multi|(X, Y) of Map(X, Y) as follows. If x ∈ D_+ is not a fixed point of f(λ; x) = x, but there is some n ∈ N such that f^n(λ; x) = x, then the limit n → ∞ generates a multifunction at x as was the case with the delta function in the previous section and the various other examples that we have seen so far in the earlier sections.

     One of the main goals of investigations on the spectrum of nonlinear operators is to find a set in the complex plane that has the usual desirable properties of the spectrum of a linear operator [Appell et al., 2000]. In this case, the focus has been to find a suitable class of operators C(X) with T ∈ C(X), such that the resolvent set is expressed as

\[
\rho(T) = \{\lambda \in \mathbb{C} : (T_\lambda \text{ is } 1:1)\,(\mathrm{Cl}(R(T_\lambda)) = X) \text{ and } (T_\lambda^{-1} \in C(X) \text{ on } R(T_\lambda))\}
\]

with the spectrum σ(T) being defined as the complement of this set. Among the classes C(X) that have been considered, besides spaces of continuous functions C(X), are linear boundedness B(X), Fréchet differentiability C^1(X), Lipschitz continuity Lip(X), and Granas quasiboundedness Q(X), where Lip(X) specifically takes into account the nonlinearity of T to define

\[
\begin{aligned}
\|T\|_{\mathrm{Lip}} &= \sup_{x \ne y} \frac{\|T(x) - T(y)\|}{\|x - y\|}, \\
|T|_{\mathrm{lip}} &= \inf_{x \ne y} \frac{\|T(x) - T(y)\|}{\|x - y\|}
\end{aligned}
\tag{72}
\]

that are plain generalizations of the corresponding norms of linear operators. Plots of f_λ^−(y) = {x ∈ D(f − λ) : (f − λ)x = y} for the functions f : R → R

\[
\begin{aligned}
f_{\lambda a}(x) &=
\begin{cases}
-1 - \lambda x, & x < -1 \\
(1 - \lambda)x, & -1 \le x \le 1 \\
1 - \lambda x, & 1 < x
\end{cases} \\
f_{\lambda b}(x) &=
\begin{cases}
-\lambda x, & x < 1 \\
(1 - \lambda)x - 1, & 1 \le x \le 2 \\
1 - \lambda x, & 2 < x
\end{cases} \\
f_{\lambda c}(x) &=
\begin{cases}
-\lambda x, & x < 1 \\
\sqrt{x - 1} - \lambda x, & 1 \le x
\end{cases} \\
f_{\lambda d}(x) &=
\begin{cases}
(x - 1)^2 + 1 - \lambda x, & 1 \le x \le 2 \\
(1 - \lambda)x, & \text{otherwise}
\end{cases} \\
f_{\lambda e}(x) &= \tan^{-1}(x) - \lambda x
\end{aligned}
\]
[Figure: six panels, (a)–(f).]
                                                10                0
                                                                  0               1.5
                                                                                   1.5   11   0.5
                                                                                               0.5   00 −0.5
                                                                                                          −0.5                                                                                            2
                          0.5
                          0.5                             Fig. 19. Inverses of fλ =2                      2
                                                                                  2 f − λ. The λ-values are2shown on the graphs.
0                         1.5          1        0.5        0 −0.5
                                                                                                          −1
                                                                                                           −1
                  1
                  1
                  2 2
                  2               √    −0.5
                                        −1
                                       −0.5
                            1 − 2 −x − λx,2 x−10−1
                                                                                                                 the complement of the resolvent set, is more diffi-
                           
             −2
             −2                             2    −10                                                                                10
                                                                                                                                     10
                                        22                                                                        cult to find. Here the convenient characterization of
                 −0.5      
                  fλf (x) = (1 − λ)x, 1        −1 ≤ x ≤ 1
−0.5             −0.5
         2                               1  10
             −10            √                                                                                    the resolvent of a continuous linear operator as the
    2                                                                                                                       2
                             2 x − 1 − λx,     1  x −1
                           
    1                 0   0            0.5           −1
                                                                  0.5                                             set of all 2sufficiently large λ that satisfy |λ|  M is
                                                                                                        0    0.5     11    1.5
                                                                                                              0.5 of little significance as, unlike for a linear operator,
             taken from [Appell et al., 2000] are shown −0.5 0                                                              1.5
                                                                                    −0.5
                  −1                     2              in Fig. 19.
0.5
             It is easy 0 verify that 1.5 Lipschitz and linear
                 −0.5
                        to −10 1
                            0.5
                                         the                                                                 −10
                                                                                                                  the non-existence of an inverse is not just due to
             upper and lower bounds of these maps are as in                                                                    −1 (0)} which happens to be the only way
                                                                                                               (f)the set {f
                            −10                                                                               −10
                             (e)
                              (e)                                                                               (f)
             Table 4.        −10                                                                                  a linear map can fail to be injective. Thus the map
                  The point spectrum defined by
                              (f)          2                                                                      defined piecewise as α + 2(1 − α)x for 0 ≤ x  1/2
                                                                      2
               P σ(f ) = {λ ∈ C : (f − λ)x = 0 for some x = 0}                                                    and 2(1 − x) for 1/2 ≤ x ≤ 1, with 0  α  1, is
    2                                                                                                             not invertible on its range although {f − (0)} = 1.
             is the simplest to calculate. Because of the spe-
                                                                                                                  Comparing Fig. 19 and Table 4, it is seen that in
             cial role played by the zero element 0 in generating
                                                                                                                  cases (b)–(d), the intervals [|f |b , f B ] are subsets
             the point spectrum in the linear case, the bounds
             m x ≤ Lx ≤ M x together with Lx = λx                                                                 of the λ-values for which the respective maps are
             imply Cl(P σ(L)) = [ L b , L B ] — where the                                                         not injective; this is to be compared with (a), (e)
             subscripts denote the lower and upper bounds in                                                      and (f) where the two sets are the same. Thus the
             Eq. (72) and which sometimes is taken to be a de-                                                    linear bounds are not good indicators of the unique-
             scriptor of the point spectrum of a nonlinear op-                                                    ness properties of solution of nonlinear equations for
             erator — as can be seen in Table 5 and verified                                                       which the Lipschitzian bounds are seen to be more
             from Fig. 19. The remainder of the spectrum, as                                                      appropriate.
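The piecewise map mentioned above lends itself to a quick exact check. The following is a minimal sketch, with $\alpha = 1/2$ chosen arbitrarily and the helper names `f` and `preimages` introduced only for illustration: it confirms that 0 has the single preimage $x = 1$, while another value in the range is reached from two different inputs.

```python
from fractions import Fraction

ALPHA = Fraction(1, 2)  # assumed sample value; any 0 < alpha < 1 would do


def f(x, alpha=ALPHA):
    """Piecewise map: alpha + 2(1 - alpha)x on [0, 1/2), 2(1 - x) on [1/2, 1]."""
    if not 0 <= x <= 1:
        raise ValueError("x must lie in [0, 1]")
    if x < Fraction(1, 2):
        return alpha + 2 * (1 - alpha) * x
    return 2 * (1 - x)


def preimages(y, alpha=ALPHA):
    """Solve f(x) = y on each branch, keeping solutions inside the branch domain."""
    solutions = []
    x_left = (y - alpha) / (2 * (1 - alpha))  # inverse of the first branch
    if 0 <= x_left < Fraction(1, 2):
        solutions.append(x_left)
    x_right = 1 - y / 2                       # inverse of the second branch
    if Fraction(1, 2) <= x_right <= 1:
        solutions.append(x_right)
    return solutions


zero_set = preimages(Fraction(0))    # inputs mapped to 0
double = preimages(Fraction(3, 4))   # inputs mapped to 3/4
print(zero_set, double)
```

Exact rational arithmetic is used so that the comparison of the two images is not blurred by rounding; the non-injectivity shows up away from 0, which is the point made in the text.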

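As a rough numerical cross-check of the last row of Table 4, the sketch below estimates the four bounds for $f_f$ on a finite grid. It reads $f_f$ off the definition of $f_{\lambda f}$ with the term $-\lambda x$ dropped, and it assumes that the bounds of Eq. (72), which is not restated in this section, are the infimum and supremum of $|f(x)|/|x|$ and of the secant slopes $|f(x) - f(y)|/|x - y|$. The function names and the grid are illustrative choices only.

```python
import math


def f_f(x):
    """Map f_f: 1 - 2*sqrt(-x) for x < -1, x on [-1, 1], 2*sqrt(x) - 1 for x > 1."""
    if x < -1:
        return 1 - 2 * math.sqrt(-x)
    if x > 1:
        return 2 * math.sqrt(x) - 1
    return x


def estimate_bounds(f, r_min=1e-3, r_max=1e6, n=2000):
    """Estimate (|f|_b, ||f||_B, |f|_lip, ||f||_Lip) on a symmetric geometric grid.

    Assumed definitions: inf/sup of |f(x)|/|x| and of secant slopes
    |f(x) - f(y)|/|x - y|; only neighbouring grid points are compared.
    """
    step = (r_max / r_min) ** (1.0 / (n - 1))
    positive = [r_min * step ** k for k in range(n)]
    grid = sorted([-x for x in positive] + positive)

    ratios = [abs(f(x)) / abs(x) for x in grid]
    slopes = [abs(f(b) - f(a)) / (b - a) for a, b in zip(grid, grid[1:])]
    return min(ratios), max(ratios), min(slopes), max(slopes)


lower, upper, lip_lower, lip_upper = estimate_bounds(f_f)
print(f"|f|_b ~ {lower:.4f}, ||f||_B ~ {upper:.4f}")
print(f"|f|_lip ~ {lip_lower:.4f}, ||f||_Lip ~ {lip_upper:.4f}")
```

On this grid the two upper estimates come out as 1 and the two lower estimates are of order $10^{-3}$, shrinking as `r_max` grows, which is in line with the values 1 and 0 listed for $f_f$ in Table 4. A finite grid can only approach the infima, since they are reached as $|x| \to \infty$.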
Table 4. Bounds on the functions of Fig. 19.

Function    $|f|_b$            $\|f\|_B$    $|f|_{lip}$    $\|f\|_{Lip}$
$f_a$       0                  1            0              1
$f_b$       0                  1/2          0              1
$f_c$       0                  1/2          0              $\infty$
$f_d$       $2(\sqrt{2}-1)$    $\infty$     0              2
$f_e$       0                  1            0              1
$f_f$       0                  1            0              1

Table 5. Lipschitzian and point spectra of the functions of Fig. 19.

Function    $\sigma_{Lip}(f)$    $P\sigma(f)$
$f_a$       $[0, 1]$             $(0, 1]$
$f_b$       $[0, 1]$             $[0, 1/2]$
$f_c$       $[0, \infty)$        $[0, 1/2]$
$f_d$       $[0, 2]$             $[2(\sqrt{2}-1), 1]$
$f_e$       $[0, 1]$             $(0, 1)$
$f_f$       $[0, 1]$             $(0, 1)$

In view of the above, we may draw the following conclusions. If we choose to work in the space of multifunctions $\mathrm{Multi}(X, T)$, with $T$ the topology of pointwise biconvergence, when all functional relations are (multi)invertible on their ranges, we may make the following definition for the net of functions $f(\lambda; x)$ satisfying $f(\lambda; x) = x$.

Definition 6.1. Let $f(\lambda; \cdot) \in \mathrm{Multi}(X, T)$ be a function. The resolvent set of $f$ is given by

\[
\rho(f) = \{\lambda : (f(\lambda; \cdot)^{-1} \in \mathrm{Map}(X, T)) \wedge (\mathrm{Cl}(R(f(\lambda; \cdot))) = X)\},
\]

and any $\lambda$ not in $\rho$ is in the spectrum of $f$.

Thus apart from multifunctions, $\lambda \in \sigma(f)$ also generates functions on the boundary of functional and non-functional relations in $\mathrm{Multi}(X, T)$. While it is possible to classify the spectrum into point, continuous and residual subsets, as in the linear case, it is more meaningful for nonlinear operators to consider $\lambda$ as being either in the boundary spectrum $\mathrm{Bdy}(\sigma(f))$ or in the interior spectrum $\mathrm{Int}(\sigma(f))$, depending on whether or not the multifunction $f(\lambda; \cdot)^{-}$ arises as the graphical limit of a net of functions in either $\rho(f)$ or $R\sigma(f)$. This is suggested by the spectra arising from the second row of Table 1 (injective $L_\lambda$ and discontinuous $L_\lambda^{-1}$) that lies sandwiched in the $\lambda$-plane between the two components arising from the first and third rows, see [Naylor & Sell, 1971, Sec. 6.6], for example. According to this simple scheme, the spectral set is a closed set with its boundary and interior belonging to $\mathrm{Bdy}(\sigma(f))$ and $\mathrm{Int}(\sigma(f))$, respectively. Table 6 shows this division for the examples in Fig. 19. Because 0 is no more significant than any other point in the domain of a nonlinear map in inducing non-injectivity, the division of the spectrum into the traditional sets would be as shown in Table 6; compare also with the conventional linear point spectrum of Table 5. In this nonlinear classification, the point spectrum consists of any $\lambda$ for which the inverse $f(\lambda; \cdot)^{-}$ is set-valued, irrespective of whether this is produced at 0 or not, while the continuous and residual spectra together comprise the boundary spectrum. Thus a $\lambda$ can be both at the point and the continuous or residual spectra which need not be disjoint. The continuous and residual spectra are included in the boundary spectrum which may also contain parts of the point spectrum.

Table 6. Nonlinear spectra of functions of Fig. 19. Compare the present point spectra with the usual linear spectra of Table 5.

Function    $\mathrm{Int}(\sigma(f))$    $\mathrm{Bdy}(\sigma(f))$    $P\sigma(f)$     $C\sigma(f)$    $R\sigma(f)$
$f_a$       $(0, 1)$                     $\{0, 1\}$                   $[0, 1]$         $\{1\}$         $\{0\}$
$f_b$       $(0, 1)$                     $\{0, 1\}$                   $[0, 1]$         $\{1\}$         $\{0\}$
$f_c$       $(0, \infty)$                $\{0\}$                      $[0, \infty)$    $\{0\}$         $\emptyset$
$f_d$       $(0, 2)$                     $\{0, 2\}$                   $(0, 2)$         $\{0, 2\}$      $\emptyset$
$f_e$       $(0, 1)$                     $\{0, 1\}$                   $(0, 1)$         $\{1\}$         $\{0\}$
$f_f$       $(0, 1)$                     $\{0, 1\}$                   $(0, 1)$         $\{0, 1\}$      $\emptyset$

Example 6.1. To see how these concepts apply to linear mappings, consider the equation $(D - \lambda)y(x) = r(x)$ where $D = d/dx$ is the differential
operator on $L_2[0, \infty)$, and let $\lambda$ be real. For $\lambda \neq 0$, the unique solution of this equation in $L_2[0, \infty)$ is

\[
y(x) =
\begin{cases}
e^{\lambda x}\left( y(0) + \int_0^x e^{-\lambda x'} r(x')\,dx' \right), & \lambda < 0 \\
e^{\lambda x}\left( y(0) - \int_x^\infty e^{-\lambda x'} r(x')\,dx' \right), & \lambda > 0
\end{cases}
\]

showing that for $\lambda > 0$ the inverse is functional so that $\lambda \in (0, \infty)$ belongs to the resolvent of $D$. However, when $\lambda < 0$, apart from the $y = 0$ solution (since we are dealing with a linear problem, only $r = 0$ is to be considered), $e^{\lambda x}$ is also in $L_2[0, \infty)$ so that all such $\lambda$ are in the point spectrum of $D$. For $\lambda = 0$ and $r \neq 0$, the two solutions are not necessarily equal unless $\int_0^\infty r(x)\,dx = 0$, so that the range $R(D - I)$ is a subspace of $L_2[0, \infty)$. To complete the problem, it is possible to show [Naylor & Sell, 1971] that $0 \in C\sigma(D)$, see Example 2.2; hence the continuous spectrum forms at the boundary of the functional solution for the resolvent-$\lambda$ and the multifunctional solution for the point spectrum. With a slight variation of the problem to $y(0) = 0$, all $\lambda < 0$ are in the resolvent set, while for $\lambda > 0$ the inverse is bounded but must satisfy $y(0) = \int_0^\infty e^{-\lambda x} r(x)\,dx = 0$ so that $\mathrm{Cl}(R(D - \lambda)) \neq L_2[0, \infty)$. Hence $\lambda > 0$ belong to the residual spectrum. The decomposition of the complex $\lambda$-plane for these and some other linear spectral problems taken from [Naylor & Sell, 1971] is shown in Fig. 20. In all cases, the spectrum due to the second row of Table 1 acts as a boundary between that arising from the first and third rows, which justifies our division of the spectrum for a nonlinear operator into the interior and boundary components. Compare with Example 2.2.

From the basic representation of the resolvent operator $(1 - f)^{-1}$ as

\[
1 + f + f^2 + \cdots + f^i + \cdots
\]

in $\mathrm{Multi}(X)$, if the iterates of $f$ converge to a multifunction for some $\lambda$, then that $\lambda$ must be in the spectrum of $f$, which means that the control parameter of a chaotic dynamical system is in its spectrum. Of course, the series can sum to a multi even otherwise: take $f_\lambda(x)$ to be identically $x$ with $\lambda = 1$, for example, to get $1 \in P\sigma(f)$. A comparison of Tables 1 and 5 reveals that in case (d), for example, 0 and 2 belong to the Lipschitz spectrum because although $f_d^{-1}$ is not Lipschitz continuous, $\|f\|_{Lip} = 2$. It should also be noted that the boundary between the functional resolvent and multifunctional spectral set is formed by the graphical convergence of a net of resolvent functions while the multifunctions in the interior of the spectral set evolve graphically independent of the functions in the resolvent. The chaotic states forming the boundary of the functional and multifunctional subsets of $\mathrm{Multi}(X)$ mark the transition from the less efficient functional state to the more efficient multifunctional one.

These arguments also suggest the following. The countably many outputs arising from the non-injectivity of $f(\lambda; \cdot)$ corresponding to a given input can be interpreted to define complexity because in a nonlinear system each of these possibilities constitutes an experimental result in itself that may not be combined in any definite predetermined manner. This is in sharp contrast to linear systems where a linear combination, governed by the initial conditions, always generates a unique end result; recall also the combination offered by the singular generalized eigenfunctions of neutron transport theory. This multiplicity of possibilities that have no definite combinatorial property is the basis of the diversity of nature, and is possibly responsible for Feigenbaum’s “historical prejudice” [Feigenbaum, 1992], see Prelude 2. Thus order represented by the functional resolvent passes over to complexity of the countably multifunctional interior spectrum via the uncountably multifunctional boundary that is a prerequisite for chaos. We may now strengthen our hypothesis offered at the end of the previous section in terms of the examples of Figs. 19 and 20, that nature uses chaoticity as an intermediate step to the attainment of states that would otherwise be inaccessible to it. Well-posedness of a system is an extremely inefficient way of expressing a multitude of possibilities as this requires a different input for every possible output. Nature chooses to express its myriad manifestations through the multifunctional route leading either to averaging as in the delta function case or to a countable set of well-defined states, as in the examples of Fig. 19 corresponding to the interior spectrum. Of course it is no distraction that the multifunctional states arise respectively from $f_\lambda$ and $f_\lambda^{-}$ in these examples as $f$ is a function on $X$ that is under the influence of both $f$ and its inverse. The functional resolvent is, for all practical purposes, only a tool in this structure of nature.

The equation $f(x) = y$ is typically an input–output system in which the inverse images at a functional value $y_0$ represent a set of input parameters leading to the same experimental output $y_0$; this


[Figure 20 graphic: panels (a)–(f), each showing regions of the λ-plane labelled resolvent set, continuous spectrum, residual spectrum or point spectrum.]

Fig. 20. Spectra of some linear operators in the complex λ-plane. (a) Left shift (. . . , x−1 , x0 , x1 , . . .) → (. . . x0 , x1 , x2 , . . .)
on l2 (−∞, ∞), (b) Right shift (x0 , x1 , x2 , . . .) → (0, x0 , x1 , . . .) on l2 [0, ∞), (c) Left shift (x0 , x1 , x2 , . . .) → (x1 , x2 , x3 , . . .)
on l2 [0, ∞) of sequence spaces, and (d) d/dx on L2 (−∞, ∞) (e) d/dx on L2 [0, ∞) with y(0) = 0 and (f) d/dx on L2 [0, ∞).
The residual spectra in (b) and (e) arise from block (3–3) in Table 1, i.e. Lλ is one-to-one and Lλ^{−1} is bounded on non-dense
domains in l2 [0, ∞) and L2 [0, ∞), respectively. The continuous spectrum therefore marks the boundary between two
functional states, as in (a) and (e), now with dense and non-dense domains of the inverse operator.
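
As a numerical aside to panels (b) and (c), the following minimal Python sketch checks two standard facts on finite truncations. For |λ| < 1 the geometric sequence (1, λ, λ², . . .) is a square-summable eigenvector of the left shift on l2 [0, ∞), and the range of (right shift − λ) is orthogonal to the geometric sequence built from the complex conjugate of λ, so that this range cannot be dense. The helper names left_shift, right_shift, geometric and inner, the sample value lam and the test vector y are illustrative assumptions and do not come from the paper.

```python
def left_shift(x):
    """(x0, x1, x2, ...) -> (x1, x2, x3, ...) on a finite truncation."""
    return list(x[1:])

def right_shift(x):
    """(x0, x1, x2, ...) -> (0, x0, x1, ...)."""
    return [0] + list(x)

def geometric(lam, n):
    """First n terms of (1, lam, lam^2, ...)."""
    return [lam ** k for k in range(n)]

def inner(u, v):
    """l2 inner product sum_k u_k * conj(v_k) over the common length."""
    return sum(a * complex(b).conjugate() for a, b in zip(u, v))

lam = 0.3 + 0.4j            # assumed sample point with |lam| = 0.5 < 1
x = geometric(lam, 60)

# Left shift: (S_L x)_k = lam * x_k term by term, and the norm of x is finite.
eig_residual = max(abs(a - lam * b) for a, b in zip(left_shift(x), x))
norm_sq = inner(x, x).real  # approaches 1 / (1 - |lam|^2)

# Right shift: (S_R - lam) y is orthogonal to the geometric sequence of conj(lam).
y = [1.0, -2.0, 0.5j, 3.0]  # arbitrary finitely supported test vector
u = [a - lam * b for a, b in zip(right_shift(y), y + [0])]
w = geometric(lam.conjugate(), len(u))
orth = abs(inner(u, w))

print(eig_residual, norm_sq, orth)
```

Because y has finite support, the orthogonality check involves no truncation error beyond floating-point rounding.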


is stability characterized by a complete insensitiv-                           is larger than a functional state represented by the
ity of the output to changes in input. On the other                            singleton {f (x0 )}.
hand, a continuous multifunction at x0 is a signal
for a hypersensitivity to input because the output,                            Epilogue
which is a definite experimental quantity, is a choice
                                                                               The most passionate advocates of the new science
from the possibly infinite set {f (x0 )} made by a
                                                                               go so far as to say that twentieth-century science
choice function which represents the experiment at                             will be remembered for just three things: relativity,
that particular point in time. Since there will always                         quantum mechanics and chaos. Chaos, they contend,
be finite differences in the experimental parameters                             has become the century’s third great revolution in
when an experiment is repeated, the choice function                            the physical sciences. Like the first two revolutions,
(that is the experimental output) will select a point                          chaos cuts away at the tenets of Newton’s physics. As
from {f (x0 )} that is representative of that experi-                          one physicist put it: “Relativity eliminated the New-
ment and which need not bear any definite relation                              tonian illusion of absolute space and time; quantum
to the previous values; this is instability and sig-                           theory eliminated the Newtonian dream of a con-
nals sensitivity to initial conditions. Such a state is                        trollable measurement process; and chaos eliminates
of high entropy as the number of available states                              the Laplacian fantasy of deterministic predictability.”
fC ({f (x0 )}) — where fC is the choice function —                             Of the three, the revolution in chaos applies to the

universe we see and touch, to objects at human scale.        Goldenfeld, N. & Kadanoff, L. P. [1999] “Simple lessons
. . . There has long been a feeling, not always expressed      from complexity,” Science 284, 87–89.
openly, that theoretical physics has strayed far from        Korevaar, J. [1968] Mathematical Methods, Vol. 1 (Aca-
human intuition about the world. Whether this will             demic Press, NY).
prove to be fruitful heresy, or just plain heresy, no one    Naylor, A. W. & Sell, G. R. [1971] Linear Operator
                                                               Theory in Engineering and Science (Holt, Rinehart and
knows. But some of those who thought that physics
                                                               Winston, NY).
might be working its way into a corner now look to
                                                             Peitgen, H.-O., Jürgens, H. & Saupe, D. [1992] Chaos
chaos as a way out.                                            and Fractals: New Frontiers of Science (Springer-
                                          [Gleick, 1987]       Verlag, NY).
                                                             Robinson, C. [1999] Dynamical Systems: Stability, Sym-
                                                               bolic Dynamics and Chaos (CRC Press LLC, Boca
Acknowledgments                                                Raton).
It is a pleasure to thank the referees for recom-            Roman, P. [1975] Some Modern Mathematics for Physi-
mending an enlarged Tutorial and Review revision               cists and other Outsiders (Pergamon Press, NY).
of the original submission Graphical Convergence,            Sengupta, A. [1995a] “A discretized spectral approxima-
Chaos and Complexity, and Professor Leon O. Chua               tion in neutron transport theory. Some numerical con-
                                                               siderations,” J. Stat. Phys. 51, 657–676.
for suggesting a pedagogically self-contained, jar-
                                                             Sengupta, A. [1995b] “Full range solution of half-space
gonless version accessible to a wider audience for
                                                               neutron transport problem,” ZAMP 46, 40–60.
the present form of the paper. Financial assis-              Sengupta, A. [1997] “Multifunction and generalized in-
tance during the initial stages of this work from              verse,” J. Inverse and Ill-Posed Problems 5, 265–285.
the National Board for Higher Mathematics is also            Sengupta, A. & Ray, G. G. [2000] “A multifunctional ex-
acknowledged.                                                  tension of function spaces: Chaotic systems are maxi-
                                                               mally ill-posed,” J. Inverse and Ill-Posed Problems 8,
                                                               232–353.
References                                                   Stuart, A. M. & Humphries, A. R. [1996] Dynamical
Alligood, K. T., Sauer, T. D. & Yorke, J. A. [1997] Chaos,    Systems and Numerical Analysis (Cambridge Univer-
   An Introduction to Dynamical Systems (Springer-             sity Press).
   Verlag, NY).                                              Tikhonov, A. N. & Arsenin, V. Y. [1977] Solutions of Ill-
Appell, J., DePascale, E. & Vignoli, A. [2000] “A com-        Posed Problems (V. H. Winston, Washington D.C.).
   parison of different spectra for nonlinear operators,”     Waldrop, M. M. [1992] Complexity: The Emerging Sci-
   Nonlin. Anal. 40, 73–90.                                    ence at the Edge of Order and Chaos (Simon and
Brown, R. & Chua, L. O. [1996] “Clarifying chaos:             Schuster).
   Examples and counterexamples,” Int. J. Bifurcation        Willard, S. [1970] General Topology (Addison-Wesley,
   and Chaos 6, 219–249.                                       Reading, MA).
Campbell, S. L. & Meyer, C. D. [1979] Generalized           Williams, M. M. R. [1971] Mathematical Methods of
   Inverses of Linear Transformations (Pitman Publish-         Particle Transport Theory (Butterworths, London).
   ing Ltd., London).
Case, K. M. & Zweifel, P. F. [1967] Linear Transport
   Theory (Addison-Wesley, MA).                              Appendix
de Souza, H. G. [1997] “Opening address,” in The Im-
                                                             This Appendix gives a brief overview of some as-
   pact of Chaos on Science and Society, eds. Grebogi,
   C. & Yorke, J. A. (United Nations University Press,
                                                             pects of topology that are necessary for a proper
   Tokyo), pp. 384–386.                                      understanding of the concepts introduced in this
Devaney, R. L. [1989] An Introduction to Chaotic Dy-         work.
   namical Systems (Addison-Wesley, CA).
Falconer, K. [1990] Fractal Geometry (John Wiley,
   Chichester).                                              A.1.    Convergence in Topological
Feigenbaum, M. [1992] “Foreword,” Chaos and Fractals:                Spaces: Sequence, Net and
   New Frontiers of Science (Springer-Verlag, NY),                   Filter
   pp. 1–7.
Gallagher, R. & Appenzeller, T. [1999] “Beyond reduc-        In the theory of convergence in topological spaces,
   tionism,” Science 284, p. 79.                             countability plays an important role. To understand
Gleick, J. [1987] Chaos: The Amazing Science of the Un-      the significance of this concept, some preliminaries
   predictable (Viking, NY).                                 are needed.

     The notion of a basis, or base, is a familiar one     determines reciprocally the topology U as
in analysis: a base is a subcollection of a set which
may be used to construct, in a specified manner, any
                                                                  U = {U ⊆ X : U = ∪_{B ∈ T B} B} .            (A.4)
element of the set. This simplifies the statement of
a problem since a smaller number of elements of
the base can be used to generate the larger class          This means that the topology on X can be recon-
of every element of the set. This philosophy finds          structed from the base by taking all possible unions
application in topological spaces as follows.              of members of the base, and a collection of subsets
     Among the three properties (N1) − (N3) of the         of a set X is a topological base iff Eq. (A.4) of arbi-
neighborhood system Nx of Tutorial 4, (N1) and             trary unions of elements of T B generates a topology
(N2) are basic in the sense that the resulting sub-        on X. This topology, which is the coarsest (that is
collection of Nx can be used to generate the full          the smallest) that contains T B, is obviously closed
system by applying (N3); this basic neighborhood           under finite intersections. Since the open set Int(N )
system, or neighborhood (local) base Bx at x, is char-     is a neighborhood of x whenever N is, Eq. (A.2)
acterized by                                               and the definition Eq. (17) of Nx imply that the
(NB1) x belongs to each member B of Bx .                   open neighborhood system of any point in a topo-
                                                           logical space is an example of a neighborhood base
(NB2) The intersection of any two members of Bx
                                                           at that point, an observation that has often led, to-
contains another member of Bx : B1 , B2 ∈ Bx ⇒
                                                           gether with Eq. (A.3), to the use of the term “neigh-
(∃B ∈ Bx : B ⊆ B1 ∩ B2 ).
                                                           borhood” as a synonym for “non-empty open set”.
                                                           The distinction between the two however is signifi-
       Formally, compare Eq. (18),
                                                           cant as neighborhoods need not necessarily be open
Definition A.1.1. A neighborhood (local) base Bx            sets; thus while not necessary, it is clearly sufficient
at x in a topological space (X, U) is a subcollection      for the local basic sets B to be open in Eqs. (A.1)
of the neighborhood system Nx having the prop-             and (A.2). If Eq. (A.2) holds for every x ∈ N , then
erty that each N ∈ Nx contains some member of              the resulting Nx reduces to the topology induced by
Bx . Thus                                                  the open basic neighborhood system B x as given by
                                                           Eq. (18).
 Bx := {B ∈ Nx : x ∈ B ⊆ N for each N ∈ Nx }                    In order to check if a collection of subsets T B
                                        (A.1)              of X qualifies to be a basis, it is not necessary to
                                                           verify properties (T1)–(T3) of Tutorial 4 for the
determines the full neighborhood system                    class (A.4) generated by it because of the proper-
  Nx = {N ⊆ X : x ∈ B ⊆ N for some B ∈ Bx }                ties (TB1) and (TB2) below whose strong affinity to
                                        (A.2)              (NB1) and (NB2) is formalized in Theorem A.1.1.

reciprocally as all supersets of the basic elements.       Theorem A.1.1. A collection T B of subsets of X
                                                           is a topological basis on X iff
     The entire neighborhood system Nx , which is
                                                           (TB1) X = ∪_{B ∈ T B} B. Thus each x ∈ X must be-
recovered from the base by forming all supersets of
                                                           long to some B ∈ T B which implies the existence
the basic neighborhoods, is trivially a local base at
                                                           of a local base at each point x ∈ X.
x; non-trivial examples are given below.
     The second example of a base, consisting as           (TB2) The intersection of any two members B1 and
usual of a subcollection of a given collection, is the     B2 of T B with x ∈ B1 ∩ B2 contains another mem-
topological base T B that allows the specification of      ber of T B: (B1 , B2 ∈ T B) ∧ (x ∈ B1 ∩ B2 ) ⇒ (∃B ∈
the topology on a set X in terms of a smaller col-         T B : x ∈ B ⊆ B1 ∩ B2 ).
lection of open sets.                                      This theorem, together with Eq. (A.4) ensures
Definition A.1.2. A base T B in a topological space         that a given collection of subsets of a set X sat-
(X, U) is a subcollection of the topology U having         isfying (TB1) and (TB2) induces some topology
the property that each U ∈ U contains some mem-            on X; compared to this is the result that any
ber of T B. Thus                                           collection of subsets of a set X is a subbasis
                                                           for some topology on X. If X, however, already
 T B := {B ∈ U : B ⊆ U for each U ∈ U}            (A.3)    has a topology U imposed on it, then Eq. (A.3)

must also be satisfied in order that the topol-             R2 . Of course, the entire neighborhood system at
ogy generated by T B is indeed U. The next the-            any point of a topological space is itself a (less use-
orem connects the two types of bases of Defini-             ful) local base at that point. By Theorem A.1.2,
tions A.1.1 and A.1.2 by asserting that although           Bε (x; d), Dε (x; d), ε > 0, Bq (x; d), Q ∋ q > 0 and
a local base of a space need not consist of open           B1/n (x; d), n ∈ Z+ , for all x ∈ X are examples of
sets and a topological base need not have any ref-         bases in a metrizable space with topology induced
erence to a point of X, any subcollection of the           by a metric d.
base containing a point is a local base at that
point.                                                          In terms of local bases and bases, it is now pos-
                                                           sible to formulate the notions of first and second
Theorem A.1.2. A collection of open sets    T B is         countability as follows.
a base for a topological space (X, U) iff for each
x ∈ X, the subcollection                                   Definition A.1.3. A topological space is first
                                                           countable if each x ∈ X has some countable neigh-
         Bx = {B ∈ U : x ∈ B ∈ T B}                (A.5)   borhood base, and is second countable if it has a
of basic sets containing x is a local base at x.           countable base.
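
The conditions (TB1) and (TB2) of Theorem A.1.1, the reconstruction of the topology by unions in Eq. (A.4), and the local base of Eq. (A.5) can be tried out on a finite set. The following is a minimal Python sketch under that finiteness assumption; the function names is_topological_base, generated_topology and local_base, and the three-point example, are illustrative choices and do not appear in the paper.

```python
from itertools import combinations

def is_topological_base(X, base):
    """Check (TB1) and (TB2) of Theorem A.1.1 for subsets of a finite set X."""
    X = frozenset(X)
    base = [frozenset(B) for B in base]
    # (TB1): the members of the base cover X.
    tb1 = frozenset().union(*base) == X
    # (TB2): each point of B1 ∩ B2 lies in a basic set contained in B1 ∩ B2.
    tb2 = all(
        any(x in B and B <= (B1 & B2) for B in base)
        for B1 in base for B2 in base for x in (B1 & B2)
    )
    return tb1 and tb2

def generated_topology(base):
    """Eq. (A.4): all unions of basic sets; the empty union gives the empty set."""
    base = [frozenset(B) for B in base]
    opens = {frozenset()}
    for r in range(1, len(base) + 1):
        for members in combinations(base, r):
            opens.add(frozenset().union(*members))
    return opens

def local_base(base, x):
    """Eq. (A.5): the basic sets that contain the point x."""
    return [frozenset(B) for B in base if x in B]

X = {1, 2, 3}
base = [{1}, {2}, {1, 2, 3}]
topology = generated_topology(base)

print(is_topological_base(X, base))              # True
print(sorted(sorted(U) for U in topology))       # [[], [1], [1, 2], [1, 2, 3], [2]]
print(is_topological_base(X, [{1, 2}, {2, 3}]))  # False: no basic set lies inside {2}

# Theorem A.1.2 on this example: every open set containing x contains
# some member of the local base at x.
print(all(any(B <= U for B in local_base(base, x))
          for U in topology for x in U))         # True
```

The brute-force enumeration of unions is exponential in the size of the base, which is acceptable only for small illustrative examples such as this one.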

                                                                Every metrizable space (X, d) is first countable
Proof. Necessity. Let T B be a base of (X, U) and
                                                           as both {B(x, q)}_{Q∋q>0} and {B(x, 1/n)}_{n∈Z+} are
N be a neighborhood of x, so that x ∈ U ⊆ N for
                                                           examples of countable neighborhood bases at any
some open set U = ∪_{B ∈ T B} B and basic open sets
                                                           x ∈ (X, d); hence Rn is first countable. It should be
B. Hence x ∈ B ⊆ N shows, from Eq. (A.1), that
                                                           clear that every second countable space is first
B ∈ Bx is a local basic set at x.
                                                           countable, and conversely that every countable first
     Sufficiency. If U is an open set of X contain-
                                                           countable space is second countable; a common ex-
ing x, then the definition of local base Eq. (A.1)
                                                           ample of an uncountable first countable space that
requires x ∈ Bx ⊆ U for some subcollection of basic
                                                           is also second countable is provided by Rn . Metriz-
sets Bx in Bx ; hence U = ∪_{x∈U} Bx . By Eq. (A.4)
                                                           able spaces need not be second countable: any un-
therefore, T B is a topological base for X.
                                                           countable set having the discrete topology is an
                                                           example.
   Because the basic sets are open, (TB2) of
Theorem A.1.1 leads to the following physically            Example A.1.2. The following is an important ex-
appealing paraphrase of Theorem A.1.2.                     ample of a space that is not first countable as it is
                                                           needed for our pointwise biconvergence of Sec. 3.
Corollary.   A collection T B of open sets of (X, U)       Let Map(X, Y ) be the set of all functions between
is a topological base that generates U iff for each         the uncountable spaces (X, U) and (Y, V). Given
open set U of X and each x ∈ U there is an open            any integer I ≥ 1, and any finite collection of points
set B ∈ T B such that x ∈ B ⊆ U ; that is iff              (xi)_{i=1}^I of X and of open sets (Vi)_{i=1}^I in Y , let

      x ∈ U ∈ U ⇒ (∃B ∈ T B : x ∈ B ⊆ U ) .                B((xi)_{i=1}^I ; (Vi)_{i=1}^I ) = {g ∈ Map(X, Y ) :

Example A.1.1. Some examples of local bases in R                                 (g(xi ) ∈ Vi )(i = 1, 2, . . . , I)}
are intervals of the type (x−ε, x+ε), [x−ε, x+ε] for                                                           (A.6)
real ε, (x−q, x+q) for rational q, (x−1/n, x+1/n)
for n ∈ Z+ , while for a metrizable space with the         be the functions in Map(X, Y ) whose graphs pass
topology induced by a metric d, each of the follow-        through each of the sets (Vi)_{i=1}^I at (xi)_{i=1}^I , and
ing is a local base at x ∈ X: Bε (x; d) := {y ∈ X :        let T B be the collection of all such subsets of
d(x, y) < ε} and Dε (x; d) := {y ∈ X : d(x, y) ≤ ε}        Map(X, Y ) for every choice of I, (xi)_{i=1}^I , and
for ε > 0, Bq (x; d) for Q ∋ q > 0 and B1/n (x; d)         (Vi)_{i=1}^I . The existence of a unique topology T (the
for n ∈ Z+ . In R2 , two neighborhood bases at any         topology of pointwise convergence on Map(X, Y ))
x ∈ R2 are the disks centered at x and the set             that is generated by the open sets B of the collec-
of all squares at x with sides parallel to the axes.       tion T B now follows because
Although these bases have no elements in common,           (TB1) is satisfied: For any f ∈ Map(X, Y ) there
they are nevertheless equivalent in the sense that         must be some x ∈ X and a corresponding V ⊆ Y
they both generate the same (usual) topology in            such that f (x) ∈ V , and

(TB2) is satisfied because                                the space (X, U) is not first countable (and as seen
                                                         above this is not a rare situation), it is not diffi-
  B((si)_{i=1}^I ; (Vi)_{i=1}^I ) ∩ B((tj)_{j=1}^J ; (Wj)_{j=1}^J )
                                                         cult to realize that sequences are inadequate to de-
    = B((si)_{i=1}^I , (tj)_{j=1}^J ; (Vi)_{i=1}^I , (Wj)_{j=1}^J )
                                                         scribe convergence in X simply because a sequence can have
                                                         only countably many values whereas the space may
implies that a function simultaneously belonging to      require uncountably many neighborhoods to com-
the two open sets on the left must pass through each     pletely define the neighborhood system at a point.
of the points defining the open set on the right.         The resulting uncountable generalizations of a se-
      We now demonstrate that (Map(X, Y ), T ) is        quence in the form of nets and filters are achieved
not first countable by verifying that it is not           through a corresponding generalization of the index
possible to have a countable local base at any           set N to the directed set D.
f ∈ Map(X, Y ). Suppose, on the contrary, that
B_f^I ((xi)_{i=1}^I ; (Vi)_{i=1}^I ) = {g ∈ Map(X, Y ) :
(g(xi ) ∈ Vi )_{i=1}^I },                                Definition A.1.4. A directed set D is a preordered
which denotes those members of T B that                  set for which the order ⪯, known as a direction of
contain f with Vi an open neighborhood of f (xi )        D, satisfies
in Y , is a countable local base at f , see Theo-        (a) α ∈ D ⇒ α ⪯ α (that is ⪯ is reflexive).
rem A.1.2. Since X is uncountable, it is now pos-        (b) α, β, γ ∈ D such that (α ⪯ β ∧ β ⪯ γ) ⇒ α ⪯ γ
sible to choose some x∗ ∈ X different from any of             (that is ⪯ is transitive).
the (xi)_{i=1}^I (for example, let x∗ ∈ R be an irrational
for rational (xi)_{i=1}^I ), and let f (x∗ ) ∈ V ∗ where V ∗   (c) α, β ∈ D ⇒ ∃γ ∈ D such that (α ⪯ γ ∧ β ⪯ γ).
is an open neighborhood of f (x∗ ). Then B(x∗ ; V ∗ )    While the first two properties are obvious enough,
is an open set in Map(X, Y ) containing f ; hence        the third which replaces antisymmetry, ensures that
from the definition of the local base, Eq. (A.1), or      for any finite number of elements of the directed set,
equivalently from the Corollary to Theorem A.1.2,        there is always a successor (upper bound). Exam-
there exists some (countable) I ∈ N such that            ples of directed sets can be both straightforward,
f ∈ B^I ⊆ B(x∗ ; V ∗ ). However,                         as any totally ordered set like N, R, Q, or Z and
                                                         all subsets of a set X under the superset or subset
            { yi ∈ Vi ,    if x = xi , and 1 ≤ i ≤ I     relation (that is (P(X), ⊇) or (P(X), ⊆)) that are
  f ∗(x) =  { y∗ ∉ V ∗ ,   if x = x∗                     directed by their usual ordering, and not quite so
            { arbitrary,   otherwise                     obvious as the following examples which are signifi-
                                                         cantly useful in dealing with convergence questions
is a simple example of a function on X that is in B^I    in topological spaces, amply illustrate.
(as it is immaterial as to what values the function           The neighborhood system
takes at points other than those defining B I ), but
not in B(x∗ ; V ∗ ). From this it follows that a suffi-                      DN   = {N : N ∈ Nx }
cient condition for the topology of pointwise conver-    at a point x ∈ X, directed by the reverse inclusion
gence to be first countable is that X be countable.       direction defined as
     Even though it is not first countable,                      M ⪯ N ⇔ N ⊆ M        for M, N ∈ Nx ,    (A.7)
(Map(X, Y ), T ) is a Hausdorff space when Y is           is a fundamental example of a natural direction of
Hausdorff. Indeed, if f , g ∈ (Map(X, Y ), T ) with       Nx . In fact while reflexivity and transitivity are
f ≠ g, then f (x) ≠ g(x) for some x ∈ X. But             obvious, (c) follows because for any M, N ∈
then as Y is Hausdorff, it is possible to choose dis-     Nx , M ⪯ M ∩ N and N ⪯ M ∩ N . Of course, this
joint open sets Vf and Vg at f (x) and g(x)              direction is not a total ordering on Nx . A more nat-
respectively, so that B(x; Vf ) and B(x; Vg ) are        urally useful directed set in convergence theory is
disjoint open sets containing f and g.
     With this background on first and second
countability, it is now possible to go back to the              D Nt   = {(N, t) : (N ∈ Nx )(t ∈ N )}    (A.8)
question of nets, filters and sequences. Technically,     under its natural direction
a sequence on a set X is a map x : N → X from the          (M, s) ⪯ (N, t) ⇔ N ⊆ M      for M, N ∈ Nx ;
set of natural numbers to X; instead of denoting
this in the usual functional manner of x(i) with                                                         (A.9)
i ∈ N, it is the standard practice to use the nota-      D Nt  is more useful than D N because, unlike the
tion (xi )i∈N for the terms of a sequence. However, if   latter, D Nt does not require a simultaneous choice

of points from every N ∈ Nx that implicitly in-                Definition A.1.7. A net χ : D → X converges to
volves a simultaneous application of the Axiom of              x ∈ X if it is eventually in every neighborhood of
Choice; see Example A.1.3 below. The general in-               x, that is
dexed variation
                                                                        (∀N ∈ Nx )(∃µ ∈ D)(χ(ν          µ) ∈ N ) .
     D Nβ   = {(N, β) : (N ∈ Nx )(β ∈ D)(xβ ∈ N )}             The point x is known as the limit of χ and the col-
                                               (A.10)          lection of all limits of a net is the limit set
of Eq. (A.8), with natural direction                               lim(χ) = {x ∈ X : (∀N ∈ Nx )(∃Rβ ∈ Res(D))
     (M, α) ≤ (N, β) ⇔ (α ⪯ β) ∧ (N ⊆ M ) ,    (A.11)                       (χ(Rβ ) ⊆ N )}                (A.12)
often proves useful in applications as will be clear           of χ, with the set of residuals Res(D) in D given by
from the proofs of Theorems A.1.3 and A.1.4.                         Res(D) = {Rα ∈ P(D) : Rα = {β ∈ D
Definition A.1.5 (Net).     Let X be any set and D                             for all β ⪰ α ∈ D}} .                  (A.13)
a directed set. A net χ : D → X in X is a function             The net adheres at x ∈ X 27 if it is frequently in
on the directed set D with values in X.                        every neighborhood of x, that is

     A net, to be denoted as χ(α), α ∈ D, is there-                ((∀N ∈ Nx )(∀µ ∈ D))((∃ν ⪰ µ) : χ(ν) ∈ N ) .
fore a function indexed by a directed set. We adopt            The point x is known as the adherent of χ and the
the convention of denoting nets in the manner of               collection of all adherents of χ is the adherent set
functions and do not use the sequential notation χα            of the net, which may be expressed in terms of the
that can also be found in the literature. Thus, while          cofinal subset of D
every sequence is a special type of net, χ : Z → X
is an example of a net that is not a sequence.                         Cof(D) = {Cα ∈ P(D) : Cα = {β ∈ D
     Convergence of sequences and nets is de-                                   for some β ⪰ α ∈ D}}     (A.14)
scribed most conveniently in terms of the notions of           (thus Dα is cofinal in D iff it intersects every residual
being eventually in and frequently in every neigh-             in D), as
borhood of points. We describe these concepts in
terms of nets which apply to sequences with obvi-                 adh(χ) = {x ∈ X : (∀N ∈ Nx )(∃Cβ ∈ Cof(D))
ous modifications.                                                          (χ(Cβ ) ⊆ N )}.               (A.15)
Definition A.1.6. A net χ : D → X is said to be                 This recognizes, in keeping with the limit set, each
                                                               subnet of a net to be a net in its own right, and is
(a) Eventually in a subset A of X if its tail is even-         equivalent to
    tually in A: (∃β ∈ D) : (∀γ ⪰ β)(χ(γ) ∈ A).
(b) Frequently in a subset A of X if for any index                adh(χ) = {x ∈ X : (∀N ∈ Nx )(∀Rα ∈ Res(D))
    β ∈ D, there is a successor index γ ∈ D such                              (χ(Rα ) ∩ N ≠ ∅)} .                  (A.16)
    that χ(γ) is in A: (∀β ∈ D)(∃γ ⪰ β) : (χ(γ) ∈
    A).                                                              Intuitively, a sequence is eventually in a set A
                                                               if it is always in it after a finite number of terms (of
It is not difficult to appreciate that                           course, the concept of a finite number of terms is
                                                               unavailable for nets; in this case the situation may
 (i) A net eventually in a subset is also frequently
                                                               be described by saying that a net is eventually in A
     in it but not conversely,
                                                               if its tail is in A) and it is frequently in A if it always
(ii) A net eventually (respectively, frequently) in a
                                                               returns to A to leave it again. It can be shown that
     subset cannot be frequently (respectively, even-
                                                               a net is eventually (resp. frequently) in a set iff it is
     tually) in its complement.
                                                               not frequently (resp. eventually) in its complement.
    With these notions of eventually in and fre-                     The following examples illustrate graphically
quently in, convergence characteristics of a net may           the role of a proper choice of the index set D in
be expressed as follows.                                       the description of convergence.

27
  This is also known as a cluster point; we shall, however, use this new term exclusively in the sense of the elements of a
derived set, see Definition 2.3.
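The notions of Definition A.1.6 can be made concrete for sequences that are eventually periodic, because for such sequences both properties are decided by the repeating cycle alone. The following Python sketch is illustrative only: the class `EventuallyPeriodic` and the helper names are assumptions introduced here and are not notation from the text. It tests the alternating sequence (1, −1, 1, −1, . . .) against a neighborhood of 1.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class EventuallyPeriodic:
    """Hypothetical model of a sequence: a finite prefix, then a cycle repeated forever."""
    prefix: Tuple[float, ...]
    cycle: Tuple[float, ...]  # assumed non-empty


def eventually_in(seq: EventuallyPeriodic, in_A: Callable[[float], bool]) -> bool:
    # Some tail lies in A  <=>  every term of the repeating cycle lies in A,
    # since each cycle term recurs beyond any given index.
    return all(in_A(x) for x in seq.cycle)


def frequently_in(seq: EventuallyPeriodic, in_A: Callable[[float], bool]) -> bool:
    # Every tail meets A  <=>  at least one term of the repeating cycle lies in A.
    return any(in_A(x) for x in seq.cycle)


def near_one(x: float) -> bool:
    # A neighborhood of 1 in the reals: the open interval (0.5, 1.5).
    return abs(x - 1) < 0.5


def not_near_one(x: float) -> bool:
    # The complement of that neighborhood.
    return not near_one(x)


alternating = EventuallyPeriodic(prefix=(), cycle=(1, -1))
constant_tail = EventuallyPeriodic(prefix=(5, 7), cycle=(1,))

print(eventually_in(alternating, near_one))    # False
print(frequently_in(alternating, near_one))    # True
print(eventually_in(constant_tail, near_one))  # True

# Eventually in A  <=>  not frequently in the complement of A.
for s in (alternating, constant_tail):
    assert eventually_in(s, near_one) == (not frequently_in(s, not_near_one))
    assert frequently_in(s, near_one) == (not eventually_in(s, not_near_one))
```

The alternating sequence is frequently but not eventually in the chosen neighborhood of 1, which is property (i) above. The closing loop checks the complement relation for both test sequences.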

Example A.1.3. (1) Let γ ∈ D. The eventually            to yield a self-consistent tool for the description of
constant net χ(δ) = x for δ ⪰ γ converges to x.         convergence.
(2) Let Nx be a neighborhood system at a point x in          As compared with sequences, where the index
X and suppose that the net (χ(N ))N ∈Nx is defined       set is restricted to positive integers, the considerable
by                                                      freedom in the choice of directed sets as is abun-
                                                        dantly borne out by the two preceding examples,
                       def
                 χ(M ) = s ∈ M ;              (A.17)    is not without its associated drawbacks. Thus as
here the directed index set D N is ordered by the       a trade-off, the wide range of choice of the directed
natural direction (A.7) of Nx . Then χ(N ) → x be-      sets may imply that induction methods, so common
cause given any x-neighborhood M ∈ D N , it follows     in the analysis of sequences, need no longer apply
from                                                    to arbitrary nets.
                                                        (4) The non-convergent nets (actually these are
   M ⪯ N ∈ D N ⇒ χ(N ) = t ∈ N ⊆ M            (A.18)
                                                        sequences)
that a point in any subset of M is also in M ; χ(N )
                                                        (a) (1, −1, 1, −1, . . .) adheres at 1 and −1 and
is therefore eventually in every neighborhood of x.
                                                        (b) xn = n                  if n is odd,
(3) This slightly more general form of the previous         xn = 1 − 1/(1 + n)     if n is even,
example provides a link between the complementary
concepts of nets and filters that is considered below.   adheres at 1 for its even terms, but is unbounded
For a point x ∈ X, and M, N ∈ Nx with the corre-        in the odd terms.
sponding directed set D Ms of Eq. (A.8) ordered by
                                                             A converging sequence or net is also adhering
its natural order (A.9), the net
                                                        but, as examples (4) show, the converse is false.
                             def                        Nevertheless it is true, as again is evident from ex-
                   χ(M, s) = s                (A.19)
                                                        amples (4), that in a first countable space where
converges to x because, as in the previous example,     sequences suffice, a sequence (xn ) adheres to x iff
for any given (M, s) ∈ D Ns , it follows from           some subsequence (xnm )m∈N of (xn ) converges to x.
(M, s) ⪯ (N, t) ∈ D Ms ⇒ χ(N, t) = t ∈ N ⊆ M            If the space is not first countable this has a
                                             (A.20)     corresponding equivalent formulation for nets with
                                                        subnets replacing subsequences as follows.
that χ(N, t) is eventually in every neighborhood             Let (χ(α))α∈D be a net. A subnet of χ(α) is
M of x. The significance of the directed set D Nt        the net ζ(β) = χ(σ(β)), β ∈ E, where σ : (E, ≤) →
of Eq. (A.8), as compared to D N , is evident from      (D, ⪯) is a function that captures the essence of the
the net that it induces without using the Axiom of      subsequential mapping n → nm in N by satisfying
Choice: For a subset A of X, the net χ(N, t) = t ∈ A
indexed by the directed set                             (SN1) σ is an increasing order-preserving function:
                                                        it respects the order of E: σ(β) ⪯ σ(β′) for every
   D Nt   = {(N, t) : (N ∈ Nx )(t ∈ N ∩ A)}   (A.21)    β ≤ β′ ∈ E, and
under the direction of Eq. (A.9), converges to x ∈ X    (SN2) For every α ∈ D there exists a β ∈ E such
with all such x defining the closure Cl(A) of A.        that α ⪯ σ(β).
Furthermore taking the directed set to be               These generalize the essential properties of a
  D Nt   = {(N, t) : (N ∈ Nx )(t ∈ N ∩ A − {x})}       subsequence in the sense that (1) Even though the index
                                                        sets D and E may be different, it is necessary that
                                              (A.22)
                                                        the values of E be contained in D, and (2) There
which, unlike Eq. (A.21), excludes the point x that     are arbitrarily large α ∈ D such that χ(α = σ(β))
may or may not be in the subset A of X, induces         is a value of the subnet ζ(β) for some β ∈ E. Re-
the net χ(N, t) = t ∈ A − {x} converging to x ∈ X,      calling the first of the order relations Eq. (38) on
with the set of all such x yielding the derived set     Map(X, Y ), we will denote a subnet ζ of χ by ζ ⪯ χ.
Der(A) of A. In contrast, Eq. (A.21) also includes           We now consider the concept of filter on a set
the isolated points t = x of A so as to generate        X that is very useful in visualizing the behavior
its closure. Observe how neighborhoods of a point,      of sequences and nets, and in fact filters constitute
which define convergence of nets and filters in a         an alternate way of looking at convergence ques-
topological space X, double up here as index sets       tions in topological spaces. A filter F on a set X

is a collection of nonempty subsets of X satisfying              space and F a filter on X. Then
properties (F1) − (F3) below that are simply those
                                                                 lim(F) = {x ∈ X : (∀N ∈ Nx )(∃F ∈ F)(F ⊆ N )}
of a neighborhood system Nx without specification
of the reference point x.                                                                                (A.23)

(F1) The empty set ∅ does not belong to F,                       and
(F2) The intersection of any two members of a                          adh(F) = {x ∈ X : (∀N ∈ Nx )(∀F ∈ F)
     filter is another member of the filter: F1 , F2 ∈
                                                                                    (F ∩ N ≠ ∅)}                    (A.24)
     F ⇒ F1 ∩ F2 ∈ F,
(F3) Every superset of a member of a filter belongs               are respectively the sets of limit points and adherent
     to the filter: (F ∈ F) ∧ (F ⊆ G) ⇒ G ∈ F; in                 points of F 28
     particular X ∈ F.
                                                                     A comparison of Eqs. (A.12) and (A.16) with
Example A.1.4                                                    Eqs. (A.23) and (A.24) respectively demonstrates
                                                                 their formal similarity; this inter-relation between
(1) The indiscrete filter is the smallest filter on X.             filters and nets will be made precise in Defini-
(2) The neighborhood system Nx is the important                  tions A.1.10 and A.1.11 below. It should be clear
    neighborhood filter at x on X, and any local                  from the preceding two equations that
    base at x is also a filter-base for Nx . In general
    for any subset A of X, {N ⊆ X : A ⊆ Int(N )}                                lim(F) ⊆ adh(F) ,                     (A.26)
    is a filter on X at A.                                        with a similar result
(3) The subsets of X containing a point x ∈ X form the
    principal filter F P(x) on X at x. More gener-                                 lim(χ) ⊆ adh(χ)                     (A.27)
    ally, if F consists of all supersets of a nonempty           holding for nets because of the duality between nets
    subset A of X, then F is the principal filter                 and filters as displayed by Definitions A.1.10 and
    F P(A) = {N ⊆ X : A ⊆ N } at A. Adjoining                     A.1.11 below, with the equality in Eqs. (A.26) and
    the empty set to these filters gives the                      (A.27) being true (but not characterizing) for
    p-inclusion and A-inclusion topologies on X,                  ultrafilters and ultranets respectively, see Example 4.2(3)
    respectively. The single element sets {{x}} and               for an account of this notion. It should be clear from
    {A} are particularly simple examples of                       the equations of Definition A.1.8 that
    filter-bases that generate the principal filters at x
                                                                  adh(F) = {x ∈ X : (∃ a finer filter G ⊇ F on X)
    and A.
(4) For an uncountable (resp. infinite) set X, all                          (G → x)}                        (A.28)
    cocountable (resp. cofinite) subsets of X consti-             consists of all the points of X to which some finer
    tute the cocountable (resp. cofinite or Frechet)              filter G (in the sense that F ⊆ G implies every ele-
    filter on X. Again, adding to these filters the                ment of F is also in G) converges in X; thus
    empty set gives the respective topologies.
                                                                              adh(F) = ⋃ lim(G : G ⊇ F) ,
     Like the topological and local bases T B and Bx
respectively, a subclass of F may be used to define               which corresponds to the net-result of Theo-
a filter-base F B that in turn generate F on X, just              rem A.1.5 below, that a net χ adheres to x iff there
as it is possible to define the concepts of limit and             is some subnet of χ that converges to x in X. Thus
adherence sets for a filter to parallel those for nets            if ζ ⪯ χ is a subnet of χ and F ⊆ G is a filter coarser
that follow straightforwardly from Definition A.1.7,              than G then
taken with Definition A.1.11.                                            lim(χ) ⊆ lim(ζ)          lim(F) ⊆ lim(G)
Definition A.1.8. Let (X, T ) be a topological                          adh(ζ) ⊆ adh(χ)          adh(G) ⊆ adh(F) ;

28
     The restatement
                                                  F → x ⇔ Nx ⊆ F                                                       (A.25)
of Eq. (A.23) that follows from (F3), and sometimes taken as the definition of convergence of a filter, is significant as it ties
up the algebraic filter with the topological neighborhood system to produce the filter theory of convergence in topological
spaces. From the defining properties of F it follows that for each x ∈ X, Nx is the coarsest (that is smallest) filter on X that
converges to x.

a filter G finer than a given filter F corresponds            supersets F SΣ∧ . F(F S) :=F SΣ∧ is the smallest fil-
to a subnet ζ of a given net χ. The implication of         ter on X that contains F S and is the filter generated
this correspondence should be clear from the asso-         by F S.
ciation between nets and filters contained in Defini-             Equation (A.24) can be put in the more useful
tions A.1.10 and A.1.11.                                   and transparent form given by
     A filter-base in X is a non-empty family
                                                           Theorem A.1.3. For a filter F in a space (X, T )
(Bα )α∈D = F B of subsets of X characterized by
(FB1) There are no empty sets in the collection F B:                     adh(F) = ⋂_{F ∈ F} Cl(F )
(∀α ∈ D)(Bα ≠ ∅)
                                                                                                           (A.31)
(FB2) The intersection of any two members of F B                                 = ⋂_{B ∈ F B} Cl(B) ,
contains another member of F B: Bα , Bβ ∈ F B ⇒

(∃B ∈ F B : B ⊆ Bα ∩ Bβ );                                 and dually adh(χ), are closed sets.
hence any class of subsets of X that does not con-
                                                           Proof. Follows immediately from the definitions for
tain the empty set and is closed under finite inter-
                                                           the closure of a set Eq. (20) and the adherence of a
sections is a base for a unique filter on X; compare
                                                           filter Eq. (A.24). As always, it is a matter of conve-
the properties (NB1) and (NB2) of a local basis
                                                           nience in using the basic filters F B instead of F to
given at the beginning of this Appendix. Similar to
                                                           generate the adherence set.
Definition A.1.1 for the local base, it is possible to
define                                                          It is in fact true that the limit sets lim(F) and
                                                           lim(χ) are also closed sets of X; the arguments
Definition A.1.9. A filter-base F B in a set X is a
                                                           involving ultrafilters are omitted.
subcollection of the filter F on X having the prop-
                                                               Similar to the notion of the adherence set of
erty that each F ∈ F contains some member of F B.
                                                           a filter is its core — a concept that unlike the ad-
Thus
                                                           herence, is purely set-theoretic being the infimum
       def
  FB   = {B ∈ F : B ⊆ F for each F ∈ F} (A.29)             of the filter and is not linked with any topological
                                                           structure of the underlying (infinite) set X — de-
determines the filter                                       fined as
  F = {F ⊆ X : B ⊆ F for some B ∈ F B} (A.30)                            core(F) = ⋂_{F ∈ F} F.             (A.32)
reciprocally as all supersets of the basic elements.
                                                           From Theorem A.1.3 and the fact that the closure
This is the smallest filter on X that contains F B and      of a set A is the smallest closed set that contains A,
is said to be the filter generated by its filter-base F B;   see Eq. (25) at the end of Tutorial 4, it is clear that
alternatively F B is the filter-base of F. The entire       in terms of filters
neighborhood system Nx , the local base Bx , Nx ∩ A                          A = core(F P(A))
for x ∈ Cl(A), and the set of all residuals of a di-                     Cl(A) = adh(F P(A))                (A.33)
rected set D are among the most useful examples of
filter-bases on X, A and D respectively. Of course,                             = core(Cl(F P(A)))
every filter is trivially a filter-base of itself, and the   where F P(A) is the principal filter at A; thus the
singletons {{x}}, {A} are filter-bases that generate        core and adherence sets of the principal filter at A
the principal filters F P(x) and F P(A) at x, and A         are equal respectively to A and Cl(A), a classic
respectively.                                              example of equality in the general relation Cl(⋂ Aα ) ⊆
      Paralleling the case of topological subbase T S,        ⋂ Cl(Aα ), but both are empty, for example, in
a filter subbase F S can be defined on X to be any           the case of an infinitely decreasing family of ratio-
collection of subsets of X with the finite intersection     nals centered at any irrational (leading to a princi-
property (as compared with T S where no such condi-        pal filter-base of rationals at the chosen irrational).
tion was necessary, this represents the fundamental        This is an important example demonstrating that
point of departure between topology and filter) and         the infinite intersection of a non-empty family of
it is not difficult to deduce that the filter generated       (closed ) sets with the finite intersection property
by F S on X is obtained by taking all finite inter-         may be empty, a situation that cannot arise on
sections F S∧ of members of F S followed by their          a finite set or an infinite compact set. Filters on

X with an empty core are said to be free, and             (ii) χ is frequently in A ⇒ (∀Rα ∈ Res(D))
are fixed otherwise: notice that by its very                   (A ∩ χ(Rα ) ≠ ∅) ⇒ A ∩ Fχ ≠ ∅.
definition filters cannot be free on a finite set, and a
free filter represents an additional feature that may      Limits and adherences are obviously preserved in
arise in passing from finite to infinite sets. Clearly      switching between nets (respectively, filters) and
(adh(F) = ∅) ⇒ (core(F) = ∅), but as the im-              the filters (respectively, nets) that they generate:
portant example of the rational space in the reals          lim(χ) = lim(Fχ ),      adh(χ) = adh(Fχ )      (A.34)
illustrates, the converse need not be true. Another
example of a free filter of the same type is provided       lim(F) = lim(χF ),       adh(F) = adh(χF ) . (A.35)
by the filter-base {[a, ∞) : a ∈ R} in R. Both these
                                                          The proofs of the two parts of Eq. (A.34), for ex-
examples illustrate the important property that a
                                                          ample, go respectively as follows. x ∈ lim(χ) ⇔ χ is
filter is free iff it contains the cofinite filter, and the
                                                          eventually in Nx ⇔ (∀N ∈ Nx )(∃F ∈ Fχ ) such that
cofinite filter is the smallest possible free filter on an
                                                          (F ⊆ N ) ⇔ x ∈ lim(Fχ ), and x ∈ adh(χ) ⇔ χ is
infinite set. The free cofinite filter, as these examples
                                                          frequently in Nx ⇔ (∀N ∈ Nx )(∀F ∈ Fχ )(N F =
illustrate, may be typically generated as follows. Let
                                                          ∅) ⇔ x ∈ adh(Fχ ); here F is a superset of χ(Rα ).
A be a subset of X, x ∈ Bdy X−A (A), and consider
                                                              Some examples of convergence of filters are
the directed set Eq. (A.21) to generate the corre-
sponding net in A given by χ(N ∈ Nx , t) = t ∈ A.         (1) Any filter on an indiscrete space X converges
Quite clearly, the core of any Frechet filter based on         to every point of X.
this net must be empty as the point x does not lie        (2) Any filter on a space that coincides with its
in A. In general, the intersection is empty because           topology (minus the empty set, of course) con-
if it were not so then the complement of the inter-           verges to every point of the space.
section — which is an element of the filter — would        (3) For each x ∈ X, the neighborhood filter N x
be infinite in contravention of the hypothesis that            converges to x; this is the smallest filter on X
the filter is Frechet. It should be clear that every fil-       that converges to x.
ter finer than a free filter is also free, and any filter    (4) The indiscrete filter F = {X} converges to no
coarser than a fixed filter is fixed.                            point in the space (X, {∅, A, X − A, X}), but
      Nets and filters are complementary concepts              converges to every point of X − A if X has the
and one may switch from one to the other as fol-              topology {∅, A, X} because the only neighbor-
lows.                                                         hood of any point in X − A is X which is con-
                                                              tained in the filter.
Definition A.1.10. Let F be a filter on X and let
D Fx = {(F, x) : (F ∈ F)(x ∈ F )} be a directed set            One of the most significant consequences of
with its natural direction (F, x) ⪯ (G, y) ⇒ (G ⊆           convergence theory of sequences and nets, as shown by
F ). The net χF : D Fx → X defined by                      the two theorems and the corollary following, is that
                                                          this can be used to describe the topology of a set.
                    χF (F, x) = x
                                                          The proofs of the theorems also illustrate the close
is said to be associated with the filter F, see            inter-relationship between nets and filters.
Eq. (A.20).
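The filter notions on this page can be checked by brute force on a small finite set. The Python sketch below is a hedged illustration, not part of the original development: all function names are assumptions introduced here, the four-point set X and the subset A are arbitrary choices, and convergence is tested through the restatement F → x ⇔ Nx ⊆ F of Eq. (A.25). It reproduces convergence example (4) for the indiscrete filter {X} and confirms that a filter on a finite set has a non-empty core.

```python
from itertools import combinations


def powerset(X):
    """All subsets of a finite set X, as frozensets."""
    xs = sorted(X)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]


def generated_filter(base, X):
    """All supersets in X of the members of a filter-base, cf. Eq. (A.30)."""
    return {S for S in powerset(X) if any(B <= S for B in base)}


def is_filter(F, X):
    """Check properties (F1)-(F3) by exhaustive enumeration."""
    f1 = len(F) > 0 and frozenset() not in F
    f2 = all((P & Q) in F for P in F for Q in F)
    f3 = all(S in F for P in F for S in powerset(X) if P <= S)
    return f1 and f2 and f3


def core(F):
    """Intersection of all members of the filter, Eq. (A.32)."""
    return frozenset.intersection(*F)


def neighborhoods(x, topology, X):
    """Neighborhood system N_x: subsets containing an open set that contains x."""
    return {N for N in powerset(X) if any(x in U and U <= N for U in topology)}


def lim(F, topology, X):
    """Limit set of F using F -> x iff N_x is contained in F, Eq. (A.25)."""
    return {x for x in X if neighborhoods(x, topology, X) <= F}


def adh(F, topology, X):
    """Adherence set of F, Eq. (A.24): every member meets every neighborhood."""
    return {x for x in X
            if all(G & N for N in neighborhoods(x, topology, X) for G in F)}


X = frozenset({1, 2, 3, 4})
A = frozenset({1, 2})
T1 = [frozenset(), A, X - A, X]   # topology {∅, A, X − A, X}
T2 = [frozenset(), A, X]          # topology {∅, A, X}

indiscrete = {X}                  # the indiscrete filter F = {X}
principal = generated_filter([frozenset({1})], X)  # principal filter at the point 1

print(sorted(lim(indiscrete, T1, X)))  # []
print(sorted(lim(indiscrete, T2, X)))  # [3, 4]
print(is_filter(principal, X), sorted(core(principal)))  # True [1]

# lim(F) is contained in adh(F), Eq. (A.26), for both filters in both topologies.
for F in (indiscrete, principal):
    for T in (T1, T2):
        assert lim(F, T, X) <= adh(F, T, X)
```

Exhaustive enumeration is only feasible for very small X, since the power set grows as 2^|X|; the sketch is meant for checking definitions, not for computation on realistic spaces.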
                                                          Theorem A.1.4. For a subset A of a topological
Definition A.1.11. Let χ : D → X be a net and              space X,
Rα = {β ∈ D : β ⪰ α ∈ D} a residual in D. Then
                                                            Cl(A) = {x ∈ X : (∃ a net χ in A)(χ → x)} .
          def
   F Bχ   = {χ(Rα ) : Res(D) → X for all α ∈ D}                                                    (A.36)
is the filter-base associated with χ, and the corre-       Proof. Necessity. For x ∈ Cl(A), construct a net
sponding filter Fχ obtained by taking all supersets        χ → x in A as follows. Let Bx be a topological local
of the elements of F Bχ is the filter associated with χ.   base at x, which by definition is the collection of all
    F Bχ is a filter-base in X because χ(⋂ Rα ) ⊆
                                                          open sets of X containing x. For each β ∈ D, the
⋂ χ(Rα ), that holds for any functional relation,         sets
proves (FB2). It is not difficult to verify that                         Nβ = ⋂_{α ⪯ β} {Bα : Bα ∈ Bx }
(i) χ is eventually in A ⇒ A ∈ Fχ , and

form a nested decreasing local neighborhood fil-                   Theorem A.1.5. If χ is a net in a topological space
ter base at x. With respect to the directed set                   X, then x ∈ adh(χ) iff some subnet ζ(β) = χ(σ(β))
D Nβ = {(Nβ , β) : (β ∈ D)(xβ ∈ Nβ )} of Eq. (A.10),              of χ(α), with α ∈ D and β ∈ E, converges in X to
define the desired net in A by                                     x; thus

               χ(Nβ , β) = xβ ∈ Nβ ∩ A                      adh(χ) = {x ∈ X : (∃ a subnet ζ ⪯ χ in X)(ζ → x)}.
                                                                                                                      (A.39)
where the family of non-empty decreasing subsets
Nβ ∩ A of X constitute the filter-base in A as                    Proof. Necessity. Let x ∈ adh(χ). Define a subnet
required by the directed set D Nβ . It now follows from          function σ : D Nα → D by σ(Nα , α) = α where D Nα
Eq. (A.11) and the arguments in Example A.1.3(3)                  is the directed set of Eq. (A.10): (SN1) and (SN2)
that xβ → x; compare the directed set of Eq. (A.21)               are quite evidently satisfied according to Eq. (A.11).
for a more compact, yet essentially identical, argu-              Proceeding as in the proof of the preceding theorem
ment. Carefully observe the dual roles of N x as a                it follows that xβ = χ(σ(Nα , α)) = ζ(Nα , α) → x
neighborhood filter base at x.                                     is the required converging subnet that exists from
     Sufficiency. Let χ be a net in A that                          Eq. (A.15) and the fact that χ(Rα ) ∩ Nα ≠ ∅ for
converges to x ∈ X. For any Nα ∈ Nx , there is a                  every Nα ∈ Nx , by hypothesis.
Rα ∈ Res(D) of Eq. (A.13) such that χ(Rα ) ⊆ Nα .                      Sufficiency. Assume now that χ has a subnet
Hence the point χ(α) = xα of A belongs to Nα so                   ζ(Nα , α) that converges to x. If χ does not adhere
that A ∩ Nα ≠ ∅ which means, from Eq. (20), that                  at x, there is a neighborhood Nα of x not frequented
x ∈ Cl(A).                                                        by it, in which case χ must be eventually in X −N α .
                                                                  Then ζ(Nα , α) is also eventually in X − Nα so that
Corollary.     Together with Eqs. (20) and (22), it fol-          ζ cannot be eventually in Nα , a contradiction of the
lows that                                                         hypothesis that ζ(Nα , α) → x.29

Der(A) = {x ∈ X : (∃ a net ζ in A − {x})(ζ → x)}                       Equations (A.36) and (A.39) imply that the clo-
                                           (A.37)                 sure of a subset A of X is the class of X-adherences
                                                                  of all the (sub)nets of X that are eventually in A.
The filter forms of Eqs. (A.36) and (A.37)                         This includes both the constant nets yielding the
       Cl(A) = {x ∈ X : (∃ a filter F on X)                        isolated points of A and the non-constant nets lead-
               (A ∈ F)(F → x)}                                    ing to the cluster points of A, and implies the fol-
                                                      (A.38)      lowing physically useful relationship between con-
     Der(A) = {x ∈ X : (∃ a filter F on X)                         vergence and topology that can be used as defining
              (A − {x} ∈ F)(F → x)}                               criteria for open and closed sets having a more ap-
                                                                  pealing physical significance than the original def-
then follows from Eq. (A.25) and the finite inter-
                                                                  initions of these terms. Clearly, the term “net” is
section property (F2) of F so that every neighbor-
hood of x must intersect A (respectively A − {x}) in              justifiably used here to include the subnets too.
Eq. (A.38) to produce the converging net needed in                     The following corollary of Theorem A.1.5
the proof of Theorem A.1.4.                                       summarizes the basic topological properties of sets in
                                                                  terms of nets (respectively, filters).
     We end this discussion of convergence in topo-
                                                                  Corollary.      Let A be a subset of a topological space
logical spaces with a proof of the following theorem
                                                                  X. Then
which demonstrates the relationship that
“eventually in” and “frequently in” bear with each other;        (1) A is closed in X iff every convergent net of
Eq. (A.39) below is the net-counterpart of the filter                  X that is eventually in A actually converges
equation (A.28).                                                      to a point in A (respectively, iff the adhering

29
   In a first countable space, while the corresponding proof of the first part of the theorem for sequences is essentially the same
as in the present case, the more direct proof of the converse illustrates how the convenience of nets and directed sets may
require more general arguments. Thus if a sequence (xi )i∈N has a subsequence (xik )k∈N converging to x, then a more direct
line of reasoning proceeds as follows. Since the subsequence converges to x, its tail (xik )k≥j must be in every neighborhood
N of x. But as the number of such terms is infinite whereas {ik : k < j} is only finite, it is necessary that for any given n ∈ N,
cofinitely many elements of the sequence (xik )ik ≥n be in N . Hence x ∈ adh((xi )i∈N ).

    points of each filter-base on A all belong to A).              to x unless it is of the uncountable type 30
    Thus no X-convergent net in a closed subset
    may converge to a point outside it.                                   (x0 , x1 , . . . , xI , xI+1 , xI+1 , . . .)   (A.40)
(2) A is open in X iff every convergent net of X                   with only a finite number I of distinct terms ac-
    that converges to a point in A is eventually in               tually belonging to the closed sequential set F =
    A. Thus no X-convergent net outside an open                   X −G, and xI+1 = x. Note that as we are concerned
    subset may converge to a point in the set.                    only with the eventual behavior of the sequence, we
(3) A is closed-and-open (clopen) in X iff every                   may discard all distinct terms from G by consider-
    convergent net of X that converges in A is even-              ing them to be in F , and retain only the constant
    tually in A and conversely.                                   sequence (x, x, . . .) in G. In comparison with the
(4) x ∈ Der(A) iff some net (respectively, filter-                  cofinite case that was considered in Sec. 4, the en-
    base) in A − {x} converges to x; this clearly                 tire countably infinite sequence can now lie outside
    eliminates the isolated points of A and x ∈                   a neighborhood of x thereby enforcing the eventual
    Cl(A) iff some net (respectively, filter-base) in               constancy of the sequence. This leads to a gener-
    A converges to x.                                             alization of our earlier cofinite result in the sense
                                                                  that a cocountable filter on a cocountable space con-
Remark. The differences in these characterizations
                                                                  verges to every point in the space.
should be fully appreciated: If we consider the clus-
                                                                       It is now straightforward to verify that for a
ter points Der(A) of a net χ in A as the resource
                                                                  point x0 in an uncountable cocountable space X
generated by χ, then a closed subset of X can be
considered to be selfish as it keeps all its resources             (a) Even though no sequence in the open set G =
to itself: Der(A) ∩ A = Der(A). The opposite of                       X − {x0 } can converge to x0 , yet x0 ∈ Cl(G)
this is a donor set that donates all its generated                    since the intersection of any (uncountable) open
resources to its neighbor: Der(A) ∩ X − A = Der(A),                   neighborhood U of x0 with G, being an
while for a neutral set, both Der(A) ∩ A ≠ ∅ and                      uncountable set, is not empty.
Der(A) ∩ X − A ≠ ∅ implying that the convergence                  (b) By Corollary 1 of Theorem A.1.5, the
resources generated in A and X − A can be                             uncountable open set G = X − {x0 } is also closed in X
deposited only in the respective sets. The clopen sets                because if any sequence (x1 , x2 , . . .) in G
(see diagram 2–2 of Fig. 22) are of some special                      converges to some x ∈ X, then x must be in G
interest as they are boundaryless so that no                          as the sequence must be eventually constant in
net-resources can be generated in this case as any such               order for it to converge. But this is a
limits are required to be simultaneously in the set                   contradiction as G cannot be closed since it is not
and its complement.                                                   countable.31 By the same reckoning, although
                                                                      {x0 } is not an open set because its complement
Example A.1.2. (Continued). This continuation of                      is not countable, nevertheless it follows from
Example A.1.2 illustrates how sequential conver-                      Eq. (A.40) that should any sequence converge
gence is inadequate in spaces that are not first                       to the only point x0 of this set, then it must
countable like the uncountable set with cocountable                   eventually be in {x0 } so by Corollary 2 of the
topology. In this topology, a sequence can converge                   same theorem, {x0 } becomes an open set.
to a point x in the space iff it has only a finite num-             (c) The identity map 1 : X → Xd , where Xd is
ber of distinct terms, and is therefore eventually                    X with discrete topology, is not continuous be-
constant. Indeed, let the complement                                  cause the inverse image of any singleton of X d is
           def                                                        not open in X. Yet if a sequence converges in X
         G = X − F,       F = {xi : xi ≠ x, i ∈ N}
                                                                      to x, then its image (1(x)) = (x) must actually
of the countably closed sequential set F be an open                   converge to x in Xd because a sequence con-
neighborhood of x ∈ X. Because a sequence (x i )i∈N                   verges in a discrete space, as in the cofinite or
in X converges to a point x ∈ X iff it is eventu-                      cocountable spaces, iff it is eventually constant;
ally in every neighborhood (including G) of x, the                    this is so because each element of a discrete
sequence represented by the set F cannot converge                     space being clopen is boundary-less.

30
     This is uncountable because interchanging any two eventual terms of the sequence does not alter the sequence.
31
     Note that {x} is a 1-point set but (x) is an uncountable sequence.

     This pathological behavior of sequences in a non-Hausdorff,
non-first-countable space does not arise if the discrete indexing set of
sequences is replaced by a continuous, uncountable directed set like R for
example, leading to nets in place of sequences. In this case the net can be
in an open set without having to be constant valued in order to converge to a
point in it as the open set can be defined as the complement of a closed
countable part of the uncountable net. The careful reader could not have
failed to notice that the burden of the above arguments, as also of that in
the example following Theorem 4.6, is to formalize the fact that since a
closed set is already defined as a countable (respectively finite) set, the
closure operation cannot add further points to it from its complement, and
any sequence that converges in an open set in these topologies must
necessarily be eventually constant at its point of convergence, a restriction
that no longer applies to a net. The cocountable topology thus has the very
interesting property of filtering out a countable part from an uncountable
set, as for example the rationals in R.

     This example serves to illustrate the hard truth that in a space that is
not first countable, the simplicity of sequences is not enough to describe
its topological character, and in fact “sequential convergence will be able
to describe only those topologies in which the number of (basic)
neighborhoods around each point is no greater than the number of terms in the
sequences”, [Willard, 1970]. It is important to appreciate the significance
of this interplay of convergence of sequences and nets (and of continuity of
functions of Appendix A.1) and the topology of the underlying spaces.

     A comparison of the defining properties (T1)–(T3) of topology T with
(F1)–(F3) of that of the filter F, shows that a filter is very close to a
topology with the main difference being with regard to the empty set which
must always be in T but never in F. Addition of the empty set to a filter
yields a topology, but removal of the empty set from a topology need not
produce the corresponding filter as the topology may contain nonintersecting
sets.

     The distinction between the topological and filter-bases should be
carefully noted. Thus

(a) While the topological base may contain the empty set, a filter-base
    cannot.
(b) From a given topology, form a common base by dropping all basic open sets
    that do not intersect. Then a (coarser) topology can be generated from
    this base by taking all unions, and a filter by taking all supersets
    according to Eq. (A.30). For any given filter this expression may be used
    to extract a subclass FB as a base for F.

A.2.    Initial and Final Topology

The commutative diagram of Fig. contains four sub-diagrams X − XB − f(X),
Y − XB − f(X), X − XB − Y and X − f(X) − Y. Of these, the first two are
especially significant as they can be used to conveniently define the
topologies on XB and f(X) from those of X and Y, so that fB, fB⁻¹ and G have
some desirable continuity properties; we recall that a function f : X → Y is
continuous if inverse images of open sets of Y are open in X. This simple
notion of continuity needs refinement in order that topologies on XB and f(X)
be unambiguously defined from those of X and Y, a requirement that leads to
the concepts of the so-called final and initial topologies. To appreciate the
significance of these new constructs, note that if f : (X, U) → (Y, V) is a
continuous function, there may be open sets in X that are not inverse images
of open (or for that matter of any) subsets of Y, just as it is possible for
non-open subsets of Y to contribute to U. When the triple {U, f, V} are tuned
in such a manner that these are impossible, the topologies so generated on X
and Y are the initial and final topologies respectively; they are the
smallest (coarsest) and largest (finest) topologies on X and Y that make
f : X → Y continuous. It should be clear that every image and preimage
continuous function is continuous, but the converse is not true.

     Let sat(U) := f⁻f(U) ⊆ X be the saturation of an open set U of X and
comp(V) := f f⁻(V) = V ∩ f(X) ⊆ Y be the component of an open set V of Y on
the range f(X) of f. Let Usat, Vcomp denote respectively the saturations
Usat = {sat(U) : U ∈ U} of the open sets of X and the components
Vcomp = {comp(V) : V ∈ V} of the open sets of Y whenever these are also open
in X and Y respectively. Plainly, Usat ⊆ U and Vcomp ⊆ V.

Definition A.2.1. For a function e : X → (Y, V), the preimage or initial
topology of X based on (generated by) e and V is

     IT{e; V} := {U ⊆ X : U = e⁻(V) if V ∈ Vcomp},                    (A.41)

while for q : (X, U) → Y, the image or final topology of Y based on
(generated by) U and q is

     FT{U; q} := {V ⊆ Y : q⁻(V) = U if U ∈ Usat}.                     (A.42)

Thus, the topology of (X, IT{e; V}) consists of, and only of, the
e-saturations of all the open sets of e(X), while the open sets of
(Y, FT{U; q}) are the q-images in Y (and not just in q(X)) of all the
q-saturated open sets of X.³² The need for defining (A.41) in terms of Vcomp
rather than V will become clear in the following. The subspace topology
IT{i; U} of a subset A ⊆ (X, U) is a basic example of the initial topology by
the inclusion map i : X ⊇ A → (X, U), and we take its generalization
e : (A, IT{e; V}) → (Y, V) that embeds a subset A of X into Y as the
prototype of a preimage continuous map. Clearly the topology of Y may also
contain open sets not in e(X), and any subset in Y − e(X) may be added to the
topology of Y without altering the preimage topology of X: open sets of Y not
in e(X) may be neglected in obtaining the preimage topology as
e⁻(Y − e(X)) = ∅. The final topology on a quotient set by the quotient map
Q : (X, U) → X/∼, which is just the collection of Q-images of the Q-saturated
open sets of X, known as the quotient topology of X/∼, is the basic example
of the image topology and the resulting space (X/∼, FT{U; Q}) is called the
quotient space. We take the generalization q : (X, U) → (Y, FT{U; q}) of Q as
the prototype of an image continuous function.

     The following results are specifically useful in dealing with initial
and final topologies; compare the corresponding results for open maps given
later.

Theorem A.2.1. Let (X, U) and (Y1, V1) be topological spaces and let X1 be a
set. If f : X1 → (Y1, V1), q : (X, U) → X1, and h = f ◦ q : (X, U) → (Y1, V1)
are functions with the topology U1 of X1 given by FT{U; q}, then
(a) f is continuous iff h is continuous,
(b) f is image continuous iff V1 = FT{U; h}.

Theorem A.2.2. Let (Y, V) and (X1, U1) be topological spaces and let Y1 be a
set. If f : (X1, U1) → Y1, e : Y1 → (Y, V) and g = e ◦ f : (X1, U1) → (Y, V)
are functions with the topology V1 of Y1 given by IT{e; V}, then
(a) f is continuous iff g is continuous,
(b) f is preimage continuous iff U1 = IT{g; V}.

As we need the second part of these theorems in our applications, their
proofs are indicated below. The special significance of the first parts is
that they ensure the converse of the usual result that the composition of two
continuous functions is continuous, namely that one of the components of a
composition is continuous whenever the composition is so.

Proof of Theorem A.2.1. If f is image continuous,
V1 = {V1 ⊆ Y1 : f⁻(V1) ∈ U1} and U1 = {U1 ⊆ X1 : q⁻(U1) ∈ U} are the final
topologies of Y1 and X1 based on the topologies of X1 and X, respectively.
Then V1 = {V1 ⊆ Y1 : q⁻f⁻(V1) ∈ U} shows that h is image continuous.
     Conversely, when h is image continuous,
V1 = {V1 ⊆ Y1 : h⁻(V1) ∈ U} = {V1 ⊆ Y1 : q⁻f⁻(V1) ∈ U}, with
U1 = {U1 ⊆ X1 : q⁻(U1) ∈ U}, proves f⁻(V1) to be open in X1 and thereby f to
be image continuous.

Proof of Theorem A.2.2. If f is preimage continuous,
V1 = {V1 ⊆ Y1 : V1 = e⁻(V) if V ∈ V} and
U1 = {U1 ⊆ X1 : U1 = f⁻(V1) if V1 ∈ V1} are the initial topologies of Y1 and
X1 respectively. Hence from U1 = {U1 ⊆ X1 : U1 = f⁻e⁻(V) if V ∈ V} it follows
that g is preimage continuous.
     Conversely, when g is preimage continuous,
U1 = {U1 ⊆ X1 : U1 = g⁻(V) if V ∈ V} = {U1 ⊆ X1 : U1 = f⁻e⁻(V) if V ∈ V} and
V1 = {V1 ⊆ Y1 : V1 = e⁻(V) if V ∈ V} show that f is preimage continuous.

     Since both Eqs. (A.41) and (A.42) are in terms of inverse images (the
first of which constitutes a direct, and the second an inverse, problem) the
image f(U) = comp(V) for V ∈ V is of interest as it indicates the
relationship of the openness of f with its continuity. This, and other
related concepts are examined below, where the range space f(X) is always
taken to be a subspace of Y. Openness of a function f : (X, U) → (Y, V) is
the “inverse” of continuity, when images of open sets of X are required to be
open in Y; such a function is said to be open. Following are two of the
important properties of open functions.

32
  We adopt the convention of denoting arbitrary preimage and image continuous functions by e and q respectively even though
they are not injective or surjective; recall that the embedding e : X ⊇ A → Y and the association q : X → f (X) are 1 : 1 and
onto respectively.
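
For finite sets the constructions (A.41) and (A.42) can be checked by brute force. The Python sketch below is only illustrative: the function names (`powerset`, `preimage`, `initial_topology`, `final_topology`) and the example sets are hypothetical, maps are represented as dictionaries, and the topologies are computed in their usual forms {e⁻(V) : V ∈ V} and {V ⊆ Y : q⁻(V) ∈ U}, which are assumed here to agree with (A.41) and (A.42).

```python
from itertools import chain, combinations


def powerset(points):
    """All subsets of a finite set, as frozensets."""
    pts = list(points)
    subsets = chain.from_iterable(combinations(pts, r) for r in range(len(pts) + 1))
    return [frozenset(s) for s in subsets]


def preimage(f, domain, subset):
    """f^-(subset): the points of the domain that f sends into subset."""
    return frozenset(x for x in domain if f[x] in subset)


def initial_topology(X, e, V_top):
    """Preimage topology on X: the e-preimages of the open sets of Y."""
    return {preimage(e, X, V) for V in V_top}


def final_topology(X, U_top, q, Y):
    """Image topology on Y: every subset V of Y whose q-preimage is open in X."""
    return {V for V in powerset(Y) if preimage(q, X, V) in U_top}


# Hypothetical example: f identifies 1 and 2 and misses the point "c" of Y.
X = frozenset({1, 2, 3})
Y = frozenset({"a", "b", "c"})
f = {1: "a", 2: "a", 3: "b"}

# Initial topology on X generated by f and a topology on Y.
V_top = {frozenset(), frozenset({"a"}), Y}
IT = initial_topology(X, f, V_top)

# Final topology on Y generated by a topology on X and f.
U_top = {frozenset(), frozenset({1}), frozenset({1, 2}), X}
FT = final_topology(X, U_top, f, Y)
```

In this example the open set {1} of X is not saturated and contributes nothing to `FT`, while the singleton {"c"}, which lies outside the range of f, is open in `FT`.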

(1) If f : (X, U) → (Y, f(U)) is an open function, then so is
    f : (X, U) → (f(X), IT{i; f(U)}). The converse is true if f(X) is an open
    set of Y; thus openness of f : (X, U) → (f(X), f(U)) implies that of
    f : (X, U) → (Y, V) whenever f(X) is open in Y such that f(U) ∈ V for
    U ∈ U. The truth of this last assertion follows easily from the fact that
    if f(U) is an open set of f(X) ⊂ Y, then necessarily f(U) = V ∩ f(X) for
    some V ∈ V, and the intersection of two open sets of Y is again an open
    set of Y.
(2) If f : (X, U) → (Y, V) and g : (Y, V) → (Z, W) are open functions then
    g ◦ f : (X, U) → (Z, W) is also open. It follows that the condition in
    (1) on f(X) can be replaced by the requirement that the inclusion
    i : (f(X), IT{i; V}) → (Y, V) be an open map. This interchange of f(X)
    with its inclusion i : f(X) → Y into Y is a basic result that finds
    application in many situations.

     Collected below are some useful properties of the initial and final
topologies that we need in this work.

Initial Topology. In Fig. 21(b), consider Y1 = h(X1), e → i and
f → h : X1 → (h(X1), IT{i; V}), that is, h regarded as a map onto its range.
From h⁻(B) = h⁻(B ∩ h(X1)) for any B ⊆ Y, it follows that for an open set V
of Y, h⁻(Vcomp) = h⁻(V) is an open set of X1 which, if the topology of X1 is
IT{h; V}, are the only open sets of X1. Because Vcomp is an open set of h(X1)
in its subspace topology, this implies that the preimage topologies IT{h; V}
and IT{h; IT{i; V}} of X1, generated by h as a map into (Y, V) and as a map
onto its range h(X1) respectively, are the same. Thus the preimage topology
of X1 is not affected if Y is replaced by the subspace h(X1), the part
Y − h(X1) contributing nothing to IT{h; V}.
     A preimage continuous function e : X → (Y, V) is not necessarily an open
function. Indeed, if U = e⁻(V) ∈ IT{e; V}, it is almost trivial to verify
along the lines of the restriction of open maps to its range, that
e(U) = ee⁻(V) = e(X) ∩ V, V ∈ V, is open in Y (implying that e is an open
map) iff e(X) is an open subset of Y (because finite intersections of open
sets are open). A special case of this is the important consequence that the
restriction e : (X, IT{e; V}) → (e(X), IT{i; V}) of
e : (X, IT{e; V}) → (Y, V) to its range is an open map. Even though a
preimage continuous map need not be open, it is true that an injective,
continuous and open map f : X → (Y, V) is preimage continuous. Indeed, from
its injectivity and continuity, inverse images of all open subsets of Y are
saturated-open in X, and openness of f ensures that these are the only open
sets of X, the condition of injectivity being required to exclude
non-saturated sets from the preimage topology. It is therefore possible to
rewrite Eq. (A.41) as

     U ∈ IT{e; V} ⇔ e(U) = V if V ∈ Vcomp,                            (A.43)

and to compare it with the following criterion for an injective,
open-continuous map f : (X, U) → (Y, V) that necessarily satisfies
sat(A) = A for all A ⊆ X

     U ∈ U ⇔ ({f(U)}_{U∈U} = Vcomp) ∧ (f⁻¹(V)|_{V∈V} ∈ U).            (A.44)

Final Topology. Since it is necessarily produced on the range R(q) of q, the
final topology is often considered in terms of a surjection. This however is
not necessary as, much in the spirit of the initial topology, Y − q(X) ≠ ∅
inherits the discrete topology without altering anything, thereby allowing
condition (A.42) to be restated in the following more transparent form

     V ∈ FT{U; q} ⇔ V = q(U) if U ∈ Usat,                             (A.45)

and to compare it with the following criterion for a surjective,
open-continuous map f : (X, U) → (Y, V) that necessarily satisfies
f f⁻(B) = B for all B ⊆ Y

     V ∈ V ⇔ (Usat = {f⁻(V)}_{V∈V}) ∧ (f(U)|_{U∈U} ∈ V).              (A.46)

As may be anticipated from Fig. 21, the final topology does not behave as
well for subspaces as the initial topology does. This is so because in
Fig. 21(a) the two image continuous functions h and q are connected by a
preimage continuous inclusion f, whereas in Fig. 21(b) all the three
functions are preimage continuous. Thus quite like open functions, although
image continuity of h : (X, U) → (Y1, FT{U; h}) implies that of
h : (X, U) → (h(X), IT{i; FT{U; h}}) for a subspace h(X) of Y1, the converse
need not be true unless, entirely like open functions again, either h(X) is
an open set of Y1 or i : (h(X), IT{i; FT{U; h}}) → (Y1, FT{U; h}) is an open
map. Since an open
                    [Figure: commutative diagrams, panels (a) and (b)]

                                       Fig. 21.   Continuity in final and initial topologies.



preimage continuous map is image continuous, this makes i : h(X) → Y1 an
ininal function and hence all the three legs of the commutative diagram image
continuous.
     Like preimage continuity, an image continuous function q : (X, U) → Y
need not be open. However, although the restriction of an image continuous
function to the saturated open sets of its domain is an open function, q is
unrestrictedly open iff the saturation of every open set of X is also open in
X. In fact it can be verified without much effort that a continuous, open
surjection is image continuous.
     Combining Eqs. (A.43) and (A.45) gives the following criterion for
ininality

     U and V ∈ IFT{Usat; f; V}
       ⇔ ({f(U)}_{U∈Usat} = V) ∧ (Usat = {f⁻(V)}_{V∈V}),              (A.47)

which reduces to the following for a homeomorphism f that satisfies both
sat(A) = A for A ⊆ X and f f⁻(B) = B for B ⊆ Y

     U and V ∈ HOM{U; f; V}
       ⇔ (U = {f⁻¹(V)}_{V∈V}) ∧ ({f(U)}_{U∈U} = V)                    (A.48)

and compares with

     U and V ∈ OC{U; f; V}
       ⇔ (sat(U) ∈ U : {f(U)}_{U∈U} = Vcomp)
         ∧ (comp(V) ∈ V : {f⁻(V)}_{V∈V} = Usat)                       (A.49)

for an open-continuous f.

     The following is a slightly more general form of the restriction on the
inclusion that is needed for image continuity to behave well for subspaces of
Y.

Theorem A.2.3. Let q : (X, U) → (Y, FT{U; q}) be an image continuous
function. For a subspace B of (Y, FT{U; q}),

     FT{IT{j; U}; q} = IT{i; FT{U; q}}

where q on the left is restricted to
q : (q⁻(B), IT{j; U}) → (B, FT{IT{j; U}; q}), if either q is an open map or B
is an open set of Y.

     In summary we have the useful result that an open preimage continuous
function is image continuous and an open image continuous function is
preimage continuous, where the second assertion follows on neglecting
non-saturated open sets in X; this is permitted in as far as the generation
of the final topology is concerned, as these sets produce the same images as
their saturations. Hence an image continuous function q : X → Y is preimage
continuous iff every open set in X is saturated with respect to q, and a
preimage continuous function e : X → Y is image continuous iff the e-image of
every open set of X is open in Y.

A.3.     More on Topological Spaces

This Appendix, which completes the review of those concepts of topological
spaces begun in Tutorial 4 that are needed for a proper understanding of this
work, begins with the following summary of the different possibilities in the
distribution of Der(A) and Bdy(A) between sets A ⊆ X and its complement
X − A, and follows it up with a

few other important topological concepts that have been used, explicitly or
otherwise, in this paper.

Definition A.3.1 (Separation, Connected Space). A separation (disconnection)
of X is a pair of mutually disjoint nonempty open (and therefore closed)
subsets H1 and H2 such that X = H1 ∪ H2. A space X is said to be connected if
it has no separation, that is, if it cannot be partitioned into two open or
two closed non-empty subsets. X is separated (disconnected) if it is not
connected.

     It follows from the definition that for a disconnected space X the
following are equivalent statements.
(a) There exist a pair of disjoint non-empty open subsets of X that cover X,
(b) There exist a pair of disjoint non-empty closed subsets of X that cover
    X,
(c) There exist a pair of disjoint non-empty clopen subsets of X that cover
    X,
(d) There exists a non-empty, proper, clopen subset of X.

By a connected subset is meant a subset of X that is connected when provided
with its relative topology making it a subspace of X. Thus any connected
subset of a topological space must necessarily be contained in any clopen set
that might intersect it: if C and H are respectively connected and clopen
subsets of X such that C ∩ H ≠ ∅, then C ⊂ H because C ∩ H is a non-empty
clopen set in C which must contain C because C is connected.
     For testing whether a subset of a topological space is connected, the
following relativized form of (a)–(d) is often useful.

Lemma A.3.1. A subset A of X is disconnected iff there are disjoint open sets
U and V of X satisfying

     U ∩ A ≠ ∅ ≠ V ∩ A such that A ⊆ U ∪ V, with U ∩ V ∩ A = ∅        (A.50)

or there are disjoint closed sets E and F of X satisfying

     E ∩ A ≠ ∅ ≠ F ∩ A such that A ⊆ E ∪ F, with E ∩ F ∩ A = ∅.       (A.51)

Thus A is disconnected iff there are disjoint non-empty clopen subsets in the
relative topology of A that cover A.

Lemma A.3.2. If A is a subspace of X, a separation of A is a pair of disjoint
nonempty subsets H1 and H2 of A whose union is A, neither of which contains a
cluster point of the other. A is connected iff there is no separation of A.

Proof. Let H1 and H2 be a separation of A so that they are clopen subsets of
A whose union is A. As H1 is a closed subset of A it follows that
H1 = ClX(H1) ∩ A, where ClX(H1) ∩ A is the closure of H1 in A; hence
ClX(H1) ∩ H2 = ∅. But as the closure of a subset is the union of the set and
its adherents, an empty intersection signifies that H2 cannot contain any of
the cluster points of H1. A similar argument shows that H1 does not contain
any adherent of H2.
     Conversely suppose that neither H1 nor H2 contain an adherent of the
other: ClX(H1) ∩ H2 = ∅ and ClX(H2) ∩ H1 = ∅. Hence ClX(H1) ∩ A = H1 and
ClX(H2) ∩ A = H2 so that both H1 and H2 are closed in A. But since
H1 = A − H2 and H2 = A − H1, they must also be open in the relative topology
of A.

     Following are some useful properties of connected spaces.

(c1) The closure of any connected subspace of a space is connected. In
     general, every B satisfying

                         A ⊆ B ⊆ Cl(A)

     is connected. Thus any subset of X formed from A by adjoining to it some
     or all of its adherents is connected so that a topological space with a
     dense connected subset is connected.
(c2) The union of any class of connected subspaces of X with nonempty
     intersection is a connected subspace of X.
(c3) A topological space is connected iff there is a covering of the space
     consisting of connected sets with nonempty intersection. Connectedness
     is a topological property: Any space homeomorphic to a connected space
     is itself connected.
(c4) If H1 and H2 form a separation of X and A is any connected subset of X,
     then either A ⊆ H1 or A ⊆ H2.

     While the real line R is connected, a subspace of R is connected iff it
is an interval in R.
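
On a finite space, criterion (d) and its relativized form give a direct test of connectedness. The Python sketch below is illustrative only: `subspace_topology` and `is_connected` are hypothetical names, and a topology is assumed to be supplied as a collection of frozensets that includes the empty set and the whole space.

```python
def subspace_topology(A, topology):
    """Relative topology on A: the traces U ∩ A of the open sets U of X."""
    A = frozenset(A)
    return {frozenset(U) & A for U in topology}


def is_connected(A, topology):
    """True iff A, in its relative topology, has no non-empty proper clopen subset."""
    A = frozenset(A)
    relative = subspace_topology(A, topology)
    # U is clopen in A exactly when both U and A - U are relatively open.
    return not any(U and U != A and (A - U) in relative for U in relative)


# Two hypothetical topologies on X = {1, 2, 3}.
X = frozenset({1, 2, 3})
nested = {frozenset(), frozenset({1}), frozenset({1, 2}), X}
split = {frozenset(), frozenset({1}), frozenset({2, 3}), X}

nested_is_connected = is_connected(X, nested)    # no proper clopen subset
split_is_connected = is_connected(X, split)      # {1} and {2, 3} separate X
part_is_connected = is_connected({2, 3}, split)  # relative topology is trivial
```

In the second topology the connected subset {2, 3} lies wholly inside one member of the separation {1}, {2, 3} of X, as property (c4) requires.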


[Figure: 3 × 3 grid of diagrams of a subset A of X. Rows classify A, and
columns classify X − A, as 1. Donor, 2. Selfish (Closed), 3. Neutral.
Legend: Der(A), BdyX−A(A), Der(X − A), BdyA(X − A).]

Fig. 22. Classification of a subset A of X relative to the topology of X. The derived set of A may intersect both A and
X − A (row 3), may be entirely in A (row 2), or may be wholly in X − A (row 1). A is closed iff Bdy(A) ⊆ A (row 2), open
iff Bdy(A) ⊆ X − A (column 2), and clopen iff Bdy(A) = ∅ when the derived sets of both A and X − A are contained in the
respective sets. An open set, besides being closed, may also be neutral or donor.
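
The boundary criteria quoted in the caption of Fig. 22 (A is closed iff Bdy(A) ⊆ A, open iff Bdy(A) ⊆ X − A, clopen iff Bdy(A) = ∅) can be verified by brute force on a small finite space. The sketch below is only an illustration under stated assumptions: the space `X`, the topology `T`, and the function names `closure`, `derived`, `boundary`, `classify` and `subsets` are hypothetical and do not appear in the original.

```python
from itertools import combinations

# Hypothetical example space: X = {1, 2, 3, 4} with a hand-picked topology T.
X = frozenset({1, 2, 3, 4})
T = {frozenset(s) for s in [(), (1,), (1, 2), (3, 4), (1, 3, 4), (1, 2, 3, 4)]}


def closure(A, X, opens):
    """Cl(A): points x of X all of whose open neighborhoods meet A."""
    A = frozenset(A)
    return frozenset(x for x in X if all(U & A for U in opens if x in U))


def derived(A, X, opens):
    """Der(A): points x of X all of whose open neighborhoods meet A - {x}."""
    A = frozenset(A)
    return frozenset(x for x in X
                     if all(U & (A - {x}) for U in opens if x in U))


def boundary(A, X, opens):
    """Bdy(A) = Cl(A) ∩ Cl(X - A)."""
    X, A = frozenset(X), frozenset(A)
    return closure(A, X, opens) & closure(X - A, X, opens)


def classify(A, X, opens):
    """Apply the boundary criteria for closed, open and clopen sets."""
    X, A = frozenset(X), frozenset(A)
    bdy = boundary(A, X, opens)
    return {"closed": bdy <= A, "open": bdy <= (X - A), "clopen": not bdy}


def subsets(X):
    """All subsets of the finite set X."""
    return [frozenset(c) for r in range(len(X) + 1)
            for c in combinations(sorted(X), r)]
```

For the open set A = {1} of this assumed topology, the boundary and the derived set both come out as {2}, which lies wholly in X − A, so A is open but not closed.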



The important concept of total disconnectedness introduced below needs the following

Definition A.3.2 (Component). A component C∗ of a space X is a maximal (with respect to inclusion) connected subset of X.

Thus a component is a connected subspace which is not properly contained in any larger connected subspace of X. The maximal element need not be unique as there can be more than one component of a given space, and a “maximal” rather than a “maximum” criterion is used because a component need not contain every connected subset of X; it simply must not be properly contained in any other connected subset of X. Components can be constructively defined as follows: Let x ∈ X be any point. Consider the collection of all connected subsets of X to which x belongs. Since {x} is one such set, the collection is non-empty. As the intersection of the collection is non-empty (it contains x), its union is a non-empty connected set C by property (c2). This is the largest connected set containing x and is therefore a component containing x, and we have

(C1) Let x ∈ X. The unique component of X containing x is the union of all the connected subsets of X that contain x. Conversely any non-empty connected subset A of X is contained

in that unique component of X to which each of the points of A belongs. Hence a topological space is connected iff it is the unique component of itself.

(C2) Each component C∗ of X is a closed set of X: By property (c1) above, Cl(C∗) is also connected, and from C∗ ⊆ Cl(C∗) and the maximality of C∗ it follows that C∗ = Cl(C∗). Components need not be open sets of X: an example of this is the space of rationals Q in the reals, in which the components are the individual points, which are not open in Q; see Example A.3.1(2) below.

(C3) Components of X are equivalence classes of (X, ∼) with x ∼ y iff they are in the same component: while reflexivity and symmetry are obvious enough, transitivity follows because if x, y ∈ C1 and y, z ∈ C2 with C1, C2 connected subsets of X, then x and z are in the set C1 ∪ C2, which is connected by property (c2) above as C1 and C2 have the point y in common. Components are connected disjoint subsets of X whose union is X (i.e. they form a partition of X with each point of X contained in exactly one component of X) such that any connected subset of X can be contained in only one of them. Because a connected subspace cannot properly contain any nonempty clopen subset of X, it follows that every nonempty clopen connected subspace must be a component of X.

Even when a space is disconnected, it is always possible to decompose it into pairwise disjoint connected subsets. If X is a discrete space this is the only way in which X may be decomposed into connected pieces. If X is not discrete, there may be other ways of doing this. For example, the space

\[ X = \{x \in \mathbb{R} : (0 \le x \le 1) \vee (2 < x < 3)\} \]

has the following three distinct decompositions into connected subsets:

\[ X = \left[0, \tfrac{1}{2}\right) \cup \left[\tfrac{1}{2}, 1\right] \cup \left(2, \tfrac{7}{3}\right] \cup \left(\tfrac{7}{3}, 3\right) \]

\[ X = \{0\} \cup \bigcup_{n=1}^{\infty} \left(\tfrac{1}{n+1}, \tfrac{1}{n}\right] \cup (2, 3) \]

\[ X = [0, 1] \cup (2, 3). \]

Intuition tells us that only in the third of these decompositions have we really broken up X into its connected pieces. What distinguishes the third from the other two is that neither of the pieces [0, 1] or (2, 3) can be enlarged into bigger connected subsets of X.

As connected spaces, the empty set and the singleton are considered to be degenerate and any connected subspace with more than one point is nondegenerate. At the opposite extreme of the largest possible component of a space X, which is X itself, are the singletons {x} for every x ∈ X. This leads to the extremely important notion of a

Definition A.3.3 (Totally disconnected space). A space X is totally disconnected if every pair of distinct points in it can be separated by a disconnection of X.

X is totally disconnected iff the components in X are single points, with the only nonempty connected subsets of X being the one-point sets: If x ≠ y are distinct points of a subset A of X, then A = (A ∩ H1) ∪ (A ∩ H2), where X = H1 ∪ H2 with x ∈ H1 and y ∈ H2 is a disconnection of X (it is possible to choose H1 and H2 in this manner because X is assumed to be totally disconnected), is a separation of A that demonstrates that any subspace of a totally disconnected space with more than one point is disconnected.

Table 7. Separation properties of some useful spaces.

  Space                      T0    T1    T2
  Discrete                   ✓     ✓     ✓
  Indiscrete                 ×     ×     ×
  R, standard                ✓     ✓     ✓
  Left/right ray             ✓     ×     ×
  Infinite cofinite          ✓     ✓     ×
  Uncountable cocountable    ✓     ✓     ×
  x-inclusion/exclusion      ✓     ×     ×
  A-inclusion/exclusion      ×     ×     ×

A totally disconnected space has interesting, physically appealing separation properties in terms of the (separated) Hausdorff spaces; here a topological space X is Hausdorff, or T2, iff each two distinct points of X can be separated by disjoint neighborhoods, so that for every x ≠ y ∈ X, there are neighborhoods M ∈ Nx and N ∈ Ny such that M ∩ N = ∅. This means that for any two distinct points x ≠ y ∈ X, it is impossible to find points that are arbitrarily close to both of them. Among the

properties of Hausdorff spaces, the following need to be mentioned.

(H1) X is Hausdorff iff for each x ∈ X and any point y ≠ x, there is a neighborhood N of x such that y ∉ Cl(N). This leads to the significant result that for any x ∈ X the closed singleton

\[ \{x\} = \bigcap_{N \in \mathcal{N}_x} \mathrm{Cl}(N) \]

is the intersection of the closures of any local base at that point, which in the language of nets and filters (Appendix A.1) means that a net in a Hausdorff space cannot converge to more than one point in the space and the adherent set adh(Nx) of the neighborhood filter at x is the singleton {x}.

(H2) Since each singleton is a closed set, each finite set in a Hausdorff space is also closed in X. Unlike a cofinite space, however, there can clearly be infinite closed sets in a Hausdorff space.

(H3) Any point x in a Hausdorff space X is a cluster point of A ⊆ X iff every neighborhood of x contains infinitely many points of A, a fact that has led to our mental conditioning of the points of a (Cauchy) sequence piling up in neighborhoods of the limit. Thus suppose for the sake of argument that although some neighborhood of x contains only a finite number of points of A, x is nonetheless a cluster point of A. Then there is an open neighborhood U of x such that U ∩ (A − {x}) = {x1, . . . , xn} is a finite closed set of X not containing x, and U ∩ (X − {x1, . . . , xn}), being the intersection of two open sets, is an open neighborhood of x not intersecting A − {x}, implying thereby that x ∉ Der(A); in fact U ∩ (X − {x1, . . . , xn}) is simply {x} if x ∈ A or belongs to Bdy_{X−A}(A) when x ∈ X − A. Conversely if every neighborhood of a point of X intersects A in infinitely many points, that point must belong to Der(A) by definition.

Weaker separation axioms than Hausdorffness are those of T0, respectively T1, spaces in which for every pair of distinct points at least one, respectively each one, has some neighborhood not containing the other; Table 7 lists the separation properties of some useful spaces.

It should be noted that as none of the properties (H1)–(H3) need neighborhoods of both points simultaneously, it is sufficient for X to be T1 for the conclusions to remain valid.

From its definition it follows that any totally disconnected space is a Hausdorff space and is therefore both a T1 and a T0 space as well. Conversely, if a Hausdorff space has a base of clopen sets then it is totally disconnected; this is so because if x and y are distinct points of X, then the assumed property of x ∈ H ⊆ M for every M ∈ Nx and some clopen set H yields X = H ∪ (X − H) as a disconnection of X that separates x and y ∈ X − H; note that the assumed Hausdorffness of X allows M to be chosen so as not to contain y.

Example A.3.1

(1) Every indiscrete space is connected; every subset of an indiscrete space is connected. Hence if X is empty or a singleton, it is connected. A discrete space is connected iff it is either empty or is a singleton; the only connected subsets in a discrete space are the degenerate ones. This is an extreme case of lack of connectedness, and a discrete space is the simplest example of a totally disconnected space.

(2) Q, the set of rationals considered as a subspace of the real line, is (totally) disconnected because the set of all rationals larger than a given irrational r is a clopen set in Q, and

\[ \mathbb{Q} = \big((-\infty, r) \cap \mathbb{Q}\big) \cup \big(\mathbb{Q} \cap (r, \infty)\big), \qquad r \text{ an irrational}, \]

is the union of two disjoint clopen sets in the relative topology of Q. The sets (−∞, r) ∩ Q and Q ∩ (r, ∞) are clopen in Q because neither contains a cluster point of the other. Thus for example, any neighborhood of the second must contain the irrational r in order to be able to cut the first, which means that any neighborhood of a point in either of the relatively open sets cannot be wholly contained in the other. The only connected sets of Q are one-point subsets consisting of the individual rationals. In fact, a connected piece of Q, being a connected subset of R, is an interval in R, and a nonempty interval cannot be contained in Q unless it is a singleton. It needs to be noted that the individual points of the rational line are not (cl)open because any open subset of R that contains a rational must also contain others different from

it. This example shows that a space need not be discrete for each of its points to be a component and thereby for the space to be totally disconnected.

In a similar fashion, the set of irrationals is (totally) disconnected because the set of all the irrationals larger than a given rational is an example of a clopen set in R − Q.

(3) The p-inclusion (A-inclusion) topology is connected; a subset in this topology is connected iff it is degenerate or contains p. For, a subset inherits the discrete topology if it does not contain p, and the p-inclusion topology if it contains p.

(4) The cofinite (cocountable) topology on an infinite (uncountable) space is connected; a subset in a cofinite (cocountable) space is connected iff it is degenerate or infinite (uncountable).

(5) Removal of a single point may render a connected space disconnected and even totally disconnected. In the former case, the point removed is called a cut point and in the second, it is a dispersion point. Any real number is a cut point of R, and R does not have any dispersion point.

(6) Let X be a topological space. Considering components of X as equivalence classes by the equivalence relation ∼ with Q : X → X/∼ denoting the quotient map, X/∼ is totally disconnected: As Q⁻([x]) is connected for each [x] ∈ X/∼ in a component class of X, and as any open or closed subset A ⊆ X/∼ is connected iff Q⁻(A) is open or closed, it must follow that A can only be a singleton.

The next notion of compactness in topological spaces provides an insight into the role of non-empty adherent sets of filters that lead in a natural fashion to the concept of attractors in the dynamical systems theory that we take up next.

Definition A.3.4 (Compactness). A topological space X is compact iff every open cover of X contains a finite subcover of X.

This definition of compactness has a useful equivalent contrapositive reformulation: For any given collection of open sets of X, if none of its finite subcollections covers X, then the entire collection also cannot cover X. The following theorem is a statement of the fundamental property of compact spaces in terms of adherences of filters in such spaces, the proof of which uses this contrapositive characterization of compactness.

Theorem A.2.1. A topological space X is compact iff each class of closed subsets of X with the finite intersection property (FIP) has non-empty intersection.

Proof. Necessity. Let X be a compact space. Let F = {Fα}α∈D be a collection of closed subsets of X with FIP, and let G = {X − Fα}α∈D be the corresponding open sets of X. If {Gi}, i = 1, . . . , N, is a non-empty finite subcollection from G, then {X − Gi}, i = 1, . . . , N, is the corresponding non-empty finite subcollection of F. Hence from the assumed finite intersection property of F, it must be true that

\[ X - \bigcup_{i=1}^{N} G_i = \bigcap_{i=1}^{N} (X - G_i) \neq \emptyset \qquad \text{(De Morgan's law)}, \]

so that no finite subcollection of G can cover X. Compactness of X now implies that G too cannot cover X and therefore

\[ \bigcap_{\alpha} F_\alpha = \bigcap_{\alpha} (X - G_\alpha) = X - \bigcup_{\alpha} G_\alpha \neq \emptyset. \]

The proof of the converse is a simple exercise of reversing the arguments involving the two equations in the proof above.

Our interest in this theorem and its proof lies in the following corollary, which essentially means that for every filter F on a compact space the adherent set adh(F) is not empty, and from which it follows that every net in a compact space must have a convergent subnet.

Corollary. A space X is compact iff for every class A = (Aα) of nonempty subsets of X with FIP, adh(A) = ⋂_{Aα∈A} Cl(Aα) ≠ ∅.

The proof of this result for nets given by the next theorem illustrates the general approach in such cases, which is all that is basically needed in dealing with attractors of dynamical systems; compare Theorem A.1.3.

Theorem A.3.2. A topological space X is compact iff each net in X adheres to X.

Proof. Necessity. Let X be a compact space, χ : D → X a net in X, and Rα the residual of α in the directed set D. For the filter-base (FBχ(Rα))α∈D of nonempty, decreasing, nested subsets of X associated with the net χ, compactness of X requires from

⋂_{α ⪯ δ} Cl(χ(Rα)) ⊇ χ(Rδ) ≠ ∅, that the uncountably intersecting subset

\[ \mathrm{adh}(FB_\chi) := \bigcap_{\alpha \in D} \mathrm{Cl}(\chi(R_\alpha)) \]

of X be non-empty. If x ∈ adh(FBχ) then because x is in the closure of χ(Rβ), it follows from Eq. (20) that N ∩ χ(Rβ) ≠ ∅ ³³ for every N ∈ Nx, β ∈ D. Hence χ(γ) ∈ N for some γ ⪰ β so that x ∈ adh(χ); see Eq. (A.16).

Sufficiency. Let χ be a net in X that adheres at x ∈ X. From any class F of closed subsets of X with FIP, construct as in the proof of Theorem A.1.4, a decreasing nested sequence of closed subsets Cβ = ⋂_{α ⪯ β ∈ D} {Fα : Fα ∈ F} and consider the directed set DCβ = {(Cβ, β) : (β ∈ D)(xβ ∈ Cβ)} with its natural direction (A.11) to define the net χ(Cβ, β) = xβ in X; see Definition A.1.10. From the assumed adherence of χ at some x ∈ X, it follows that N ∩ F ≠ ∅ for every N ∈ Nx and F ∈ F. Hence x belongs to each closed set F ∈ F so that x ∈ adh(F); see Eq. (A.24). Hence X is compact.

Using Theorem A.1.5 that specifies a definite criterion for the adherence of a net, this theorem reduces to the useful formulation that a space is compact iff each net in it has some convergent subnet. An important application is the following: Since every decreasing sequence (Fm) of nonempty sets has FIP (because ⋂_{m=1}^{M} Fm = FM for every finite M), every decreasing sequence of nonempty closed subsets of a compact space has nonempty intersection. For a complete metric space this is known as the Nested Set Theorem, and for [0, 1] and other compact subspaces of R as the Cantor Intersection Theorem.³⁴

For subspaces A of X, it is the relative topology that determines as usual compactness of A; however the following criterion renders this test in terms of the relative topology unnecessary and shows that the topology of X itself is sufficient to determine compactness of subspaces: A subspace K of a topological space X is compact iff each open cover of K in X contains a finite cover of K.

33. This is of course a triviality if we identify each χ(Rβ) (or F in the proof of the converse that follows) with a neighborhood N of X that generates a topology on X.

34. Nested-set theorem. If (En) is a decreasing sequence of nonempty, closed subsets of a complete metric space (X, d) such that lim_{n→∞} dia(En) = 0, then there is a unique point

\[ x \in \bigcap_{n=0}^{\infty} E_n. \]

The uniqueness arises because the limiting condition on the diameters of En implies, from property (H1), that (X, d) is a Hausdorff space.

A proper understanding of the distinction between compactness and closedness of subspaces, which often causes much confusion to the non-specialist, is expressed in the next two theorems. As a motivation for the first, and to show that not every subset of a compact space need be compact, mention may be made of the subset (a, b) of the compact closed interval [a, b] in R.

Theorem A.3.3. A closed subset F of a compact space X is compact.

Proof. Let G be an open cover of F so that an open cover of X is G ∪ {X − F}, which because of compactness of X contains a finite subcover U. Then U − {X − F} is a finite subcollection of G that covers F.

It is not true in general that a compact subset of a space is necessarily closed. For example, in an infinite set X with the cofinite topology, let F be an infinite subset of X with X − F also infinite. Then although F is not closed in X, it is nevertheless compact because every cofinite space is compact. Indeed, let G be an open cover of X and choose any non-empty G0 ∈ G. If G0 = X then {G0} is the required finite cover of X. If this is not the case, then because X − G0 = {x1, . . . , xn} is a finite set, there is a Gi ∈ G with xi ∈ Gi for each 1 ≤ i ≤ n, and therefore {G0, G1, . . . , Gn} is the finite cover that demonstrates the compactness of the cofinite space X. Compactness of F now follows because the subspace topology on F is the induced cofinite topology from X. The distinguishing feature of this topology is that it, like the cocountable, is not Hausdorff: If U and V are any two nonempty open sets of X, then they cannot be disjoint as the complements of the open sets can only be finite and if U ∩ V were to be indeed empty, then

\[ X = X - \emptyset = X - (U \cap V) = (X - U) \cup (X - V) \]

would be a finite set. An immediate fallout of this is that in an infinite cofinite space, a sequence (xi)i∈N (and even a net) with xi ≠ xj for i ≠ j behaves in an extremely unusual way: It converges, as in the indiscrete space, to every point of the space. Indeed if x ∈ X, where X is an infinite set provided with its cofinite topology, and U is any neighborhood of x, any such sequence (xi)i∈N of distinct points in X must be eventually in U because X − U is finite, and ignoring the initial set of its values lying in X − U in no way alters the ultimate behavior of the sequence (note that this implies that the filter induced on X by the sequence agrees with its topology). Thus xi → x for any x ∈ X is a reflection of the fact that there are no small neighborhoods of any point of X, with every neighborhood being almost the whole of X, except for a null set consisting of only a finite number of points. This is in sharp contrast with Hausdorff spaces where, although every finite set is also closed, every point has arbitrarily small neighborhoods that lead to unique limits of sequences. A corresponding result for cocountable spaces can be found in Example A.1.2, continued.

This example of the cofinite topology motivates the following “converse” of the previous theorem.

Theorem A.3.4. Every compact subspace of a Hausdorff space is closed.

Proof. Let K be a non-empty compact subset of X, Fig. 23, and let x ∈ X − K. Because of the separation of X, for every y ∈ K there are disjoint open subsets Uy and Vy of X with y ∈ Uy and x ∈ Vy. Hence {Uy}y∈K is an open cover for K, and from its compactness there is a finite subset A of K such that K ⊆ ⋃_{y∈A} Uy with V = ⋂_{y∈A} Vy an open neighborhood of x; V is open because each Vy is a neighborhood of x and the intersection is over finitely many points y of A. To prove that K is closed in X it is enough to show that V is disjoint from K: If there is indeed some z ∈ V ∩ K then z must be in some Uy for y ∈ A. But as z ∈ V it is also in Vy, which is impossible as Uy and Vy are disjoint. This last part of the argument in fact shows that if K is a compact subspace of a Hausdorff space X and x ∉ K, then there are disjoint open sets U and V of X containing x and K respectively.

Fig. 23. Closedness of compact subsets of a Hausdorff space.

The last two theorems may be combined to give the obviously important

Corollary. In a compact Hausdorff space, closedness and compactness of its subsets are equivalent concepts.

In the absence of Hausdorffness, it is not possible to conclude from the assumed compactness of the space that every point to which the net may converge actually belongs to the subspace.

Definition A.3.5. A subset D of a topological space (X, U) is dense in X if Cl(D) = X. Thus the closure of D is the largest open subset of X, and every neighborhood of any point of X contains a point of D not necessarily distinct from it; refer to the distinction between Eqs. (20) and (22).

Loosely, D is dense in X iff every point of X has points of D arbitrarily close to it. A self-dense (dense in itself) set is a set without any isolated points; hence A is self-dense iff A ⊆ Der(A). A closed self-dense set is called a perfect set so that a closed set A is perfect iff it has no isolated points. Accordingly

               A is perfect ⇔ A = Der(A),

means that the closure of a set without any isolated points is a perfect set.

Theorem A.3.5. The following are equivalent statements.

(1) D is dense in X.
(2) If F is any closed set of X with D ⊆ F, then F = X; thus the only closed superset of D is X.
(3) Every nonempty (basic) open set of X cuts D; thus the only open set disjoint from D is the empty set ∅.
(4) The exterior of D is empty.
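
The four statements of Theorem A.3.5 can be compared mechanically on a small finite space. The sketch below is an illustration only, under stated assumptions: the space `X`, the topology `T`, and the names `closure`, `exterior`, `density_conditions` and `subsets` are hypothetical, and the exterior of D is taken here to be the interior of X − D. A finite check of this kind illustrates the equivalences but does not prove them.

```python
from itertools import combinations

# Hypothetical example space: X = {1, 2, 3, 4} with a hand-picked topology T.
X = frozenset({1, 2, 3, 4})
T = {frozenset(s) for s in [(), (1,), (1, 2), (3, 4), (1, 3, 4), (1, 2, 3, 4)]}


def closure(D, X, opens):
    """Cl(D): points x of X all of whose open neighborhoods meet D."""
    D = frozenset(D)
    return frozenset(x for x in X if all(U & D for U in opens if x in U))


def exterior(D, X, opens):
    """Assumed definition: Ext(D) = Int(X - D), the union of open sets inside X - D."""
    comp = frozenset(X) - frozenset(D)
    return frozenset().union(*(U for U in opens if U <= comp))


def density_conditions(D, X, opens):
    """Truth values of statements (1)-(4) of the theorem for the subset D."""
    X, D = frozenset(X), frozenset(D)
    closed_sets = {X - U for U in opens}
    c1 = closure(D, X, opens) == X                       # (1) Cl(D) = X
    c2 = all(F == X for F in closed_sets if D <= F)      # (2) only closed superset is X
    c3 = all(bool(U & D) for U in opens if U)            # (3) every nonempty open set cuts D
    c4 = not exterior(D, X, opens)                       # (4) exterior of D is empty
    return c1, c2, c3, c4


def subsets(X):
    """All subsets of the finite set X."""
    return [frozenset(c) for r in range(len(X) + 1)
            for c in combinations(sorted(X), r)]
```

In this assumed topology a set is dense exactly when it contains the point 1 and at least one of 3 and 4, so that {1, 3} is dense while {2, 3, 4} is not, because the open set {1} misses it.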


[Figure 24 panels: (a) A is isolated, Der(A) = ∅; (b) A is nwd; (c) C is Cantor, C = Der(C).]

Fig. 24. Shows the distinction between isolated, nowhere dense and Cantor sets. Topologically, the Cantor set can be described as a perfect, nowhere dense, totally disconnected and compact subset of a space. (b) The closed nowhere dense set Cl(A) is the boundary of its open complement. Here downward and upward inclined hatching denote respectively $\mathrm{Bdy}_A(X - A)$ and $\mathrm{Bdy}_{X-A}(A)$.



Proof. (3) If U indeed is a non-empty open set of X with U ∩ D = ∅, then D ⊆ X − U ≠ X leads to the contradiction X = Cl(D) ⊆ Cl(X − U) = X − U ≠ X, which also incidentally proves (2). From (3) it follows that for any open set U of X, Cl(U) = Cl(U ∩ D) because if V is any open neighborhood of x ∈ Cl(U) then V ∩ U is a non-empty open set of X that must cut D so that V ∩ (U ∩ D) ≠ ∅ implies x ∈ Cl(U ∩ D). Finally, Cl(U ∩ D) ⊆ Cl(U) completes the proof.

Definition A.3.6. (a) A set A ⊆ X is said to be nowhere dense in X if Int(Cl(A)) = ∅ and residual in X if Int(A) = ∅.

A is nowhere dense in X iff

      Bdy(X − Cl(A)) = Bdy(Cl(A)) = Cl(A)

so that

      Cl(X − Cl(A)) = (X − Cl(A)) ∪ Cl(A) = X

from which it follows that

      A is nwd in X ⇔ X − Cl(A) is dense in X

and

      A is residual in X ⇔ X − A is dense in X.

Thus A is nowhere dense iff Ext(A) := X − Cl(A) is dense in X, and in particular, a closed set is nowhere dense in X iff its complement is open dense in X, with open-denseness being complementarily dual to closed-nowhere denseness. The set of rationals in the reals is an example of a set that is residual but not nowhere dense. The following are readily verifiable properties of subsets of X.

(1) A set A ⊆ X is nowhere dense in X iff it is contained in its own boundary, iff it is contained in the closure of the complement of its closure, that is A ⊆ Cl(X − Cl(A)). In particular a closed subset A is nowhere dense in X iff A = Bdy(A), that is iff it contains no non-empty open set.
(2) From M ⊆ N ⇒ Cl(M) ⊆ Cl(N) it follows, with M = X − Cl(A) and N = X − A, that a nowhere dense set is residual, but a residual set need not be nowhere dense unless it is also closed in X.
(3) Since Cl(Cl(A)) = Cl(A), Cl(A) is nowhere dense in X iff A is.
(4) For any A ⊆ X, both $\mathrm{Bdy}_A(X - A) := \mathrm{Cl}(X - A) \cap A$ and $\mathrm{Bdy}_{X-A}(A) := \mathrm{Cl}(A) \cap (X - A)$ are residual sets and, as Fig. 22 shows, $\mathrm{Bdy}_X(A) = \mathrm{Bdy}_{X-A}(A) \cup \mathrm{Bdy}_A(X - A)$ is the union of these two residual sets. When A is closed (or open) in X its boundary, consisting of the only component $\mathrm{Bdy}_A(X - A)$ (or $\mathrm{Bdy}_{X-A}(A)$) as shown by the second row (or column) of the figure, being a closed set of X is also nowhere dense in X; in fact a closed nowhere dense set is always the boundary of some open set. Otherwise, the boundary components of the two residual parts (as in the donor–donor, donor–neutral, neutral–donor and neutral–neutral cases) need not be individually closed in X (although their union is) and their union is a residual set that need not be nowhere dense in X: the union of two nowhere dense sets is nowhere dense but the union of a residual and a nowhere dense set is a residual set. One way in which a two-component boundary can be nowhere dense is by having $\mathrm{Bdy}_A(X - A) \supseteq \mathrm{Der}(A)$ or $\mathrm{Bdy}_{X-A}(A) \supseteq \mathrm{Der}(X - A)$, so that it is effectively in one piece rather than in two, as shown in Fig. 24(b).

Theorem A.3.6. A is nowhere dense in X iff each non-empty open set of X has a non-empty open subset disjoint from Cl(A).

Proof. If U is a non-empty open set of X, then $U_0 = U \cap \mathrm{Ext}(A) \neq \emptyset$ as Ext(A) is dense in X; $U_0$ is the open subset that is disjoint from Cl(A). It clearly follows from this that each non-empty open set of X has a non-empty open subset disjoint from a nowhere dense set A.

What this result (which follows just from the definition of nowhere dense sets) actually means is that no point in $\mathrm{Bdy}_{X-A}(A)$ can be isolated in it.

Corollary. A is nowhere dense in X iff Cl(A) does not contain any non-empty open set of X iff any nonempty open set that contains A also contains its closure.

[Figure 25 diagram: the successive stages $C_0, C_1, C_2, C_3, C_4$ of the construction with labeled end points, and the limit set C.]

Fig. 25. Construction of the classical 1/3-Cantor set. The end points of $C_3$, for example, in increasing order are: [0, 1/27]; [2/27, 1/9]; [2/9, 7/27]; [8/27, 1/3]; [2/3, 19/27]; [20/27, 7/9]; [8/9, 25/27]; [26/27, 1]. $C_i$ is the union of $2^i$ pairwise disjoint closed intervals each of length $3^{-i}$ and the non-empty infinite intersection $C = \bigcap_{i=0}^{\infty} C_i$ is the adherent Cantor set of the filter-base of closed sets $\{C_0, C_1, C_2, \ldots\}$.
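The construction described in the caption of Fig. 25 can be reproduced with exact rational arithmetic. The sketch below is illustrative only: the function names `cantor_stage` and `in_cantor_set` are assumptions, and the membership test is restricted to rational points, for which the rescaling orbit takes only finitely many values.

```python
from fractions import Fraction


def cantor_stage(i):
    """Return the closed intervals of C_i as a sorted list of (left, right) pairs."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(i):
        refined = []
        for left, right in intervals:
            third = (right - left) / 3
            refined.append((left, left + third))    # keep the left third
            refined.append((right - third, right))  # keep the right third
        intervals = refined
    return intervals


def in_cantor_set(x):
    """Decide whether a rational x lies in every C_i, that is, in C.

    Each step rejects x if it falls in the open middle third and otherwise
    rescales the surviving third back to [0, 1]. For rational x the orbit
    takes finitely many values, so it must eventually repeat.
    """
    x = Fraction(x)
    if not 0 <= x <= 1:
        return False
    seen = set()
    while x not in seen:
        seen.add(x)
        if Fraction(1, 3) < x < Fraction(2, 3):
            return False
        x = 3 * x if x <= Fraction(1, 3) else 3 * x - 2
    return True


c3 = cantor_stage(3)
for left, right in c3:
    print(f"[{left}, {right}]")
```

Every end point of every stage passes the membership test, and so does 1/4, which is a rational point of C that is not an end point of any $C_i$.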

Example A.3.2. Each finite subset of $R^n$ is nowhere dense in $R^n$; the set $\{1/n\}_{n=1}^{\infty}$ is nowhere dense in R. The Cantor set C is nowhere dense in [0, 1] because every neighborhood of any point in C must contain, by its very construction, a point with 1 in its ternary representation. That the interior and the interior of the closure of a set are not necessarily the same is seen in the example of the rationals in the reals: the set of rational numbers Q has empty interior because any neighborhood of a rational number contains irrational numbers (so also is the case for irrational numbers) and R = Int(Cl(Q)) ⊇ Int(Q) = ∅ justifies the notion of a nowhere dense set.

The following properties of C can be taken to define any subset of a topological space as a Cantor set; set-theoretically it should be clear from its classical middle-third construction that the Cantor set consists of all points of the closed interval [0, 1] whose infinite triadic (base 3) representation, expressed so as not to terminate with an infinite string of 1's, does not contain the digit 1. Accordingly, any end point of the infinite set of closed intervals whose intersection yields the Cantor set is represented by a repeating string of either 0 or 2, while a non end point has every other arbitrary collection of these two digits. Recalling that any number in [0, 1] is a rational iff its representation in any base is terminating or recurring (thus any decimal that neither repeats nor terminates but consists of all possible sequences of all possible digits represents an irrational number), it follows that both rationals and irrationals belong to the Cantor set.

(C1) C is totally disconnected. If possible, let C have a component containing points a and b with a < b. Then [a, b] ⊆ C ⇒ [a, b] ⊆ $C_i$ for all i. But this is impossible because we may choose i large enough to have $3^{-i} < b - a$ so that a and b must belong to two different members of the $2^i$ pairwise disjoint closed subintervals each of length $3^{-i}$ that constitute $C_i$. Hence

      [a, b] is not a subset of $C_i$ for all sufficiently large i ⇒ [a, b] is not a subset of C.

(C2) C is perfect so that for any x ∈ C every neighborhood of x must contain some other point of C. Supposing to the contrary that the singleton {x} is an open set of C, there must be an ε > 0 such that in the usual topology of R

      {x} = C ∩ (x − ε, x + ε).                   (A.52)

     Choose a positive integer i large enough to satisfy $3^{-i} < \varepsilon$. Since x is in every $C_i$, it must be in one of the $2^i$ pairwise disjoint closed intervals [a, b] ⊂ (x − ε, x + ε) each of length $3^{-i}$ whose union is $C_i$. As [a, b] is an interval, at least one of the end points of [a, b] is different from x, and since an end point belongs to C, C ∩ (x − ε, x + ε) must also contain this point thereby violating Eq. (A.52).

(C3) C is nowhere dense because each neighborhood of any point of C intersects Ext(C); see Theorem A.3.6.
(C4) C is compact because it is a closed subset contained in the compact subspace [0, 1] of R, see Theorem A.3.3. The compactness of [0, 1] follows from the Heine-Borel Theorem which states that any subset of the real line is compact iff it is both closed and bounded with respect to the Euclidean metric on R.

Compare (C1) and (C2) with the essentially similar arguments of Example A.3.1(2) for the subspace of rationals in R.

A.4.    Neutron Transport Theory

This section introduces the reader to the basics of the linear neutron transport theory where graphical convergence approximations to the singular distributions, interpreted here as multifunctions, led to the study of this paper. The one-speed (that is mono-energetic) neutron transport equation in one dimension and plane geometry is

$$\mu\frac{\partial \Phi(x,\mu)}{\partial x} + \Phi(x,\mu) = \frac{c}{2}\int_{-1}^{1}\Phi(x,\mu')\,d\mu', \qquad 0 < c < 1,\quad -1 \le \mu \le 1 \tag{A.53}$$

where x is a non-dimensional physical space variable that denotes the location of the neutron moving in a direction $\theta = \cos^{-1}(\mu)$, $\Phi(x,\mu)$ is a neutron density distribution function such that $\Phi(x,\mu)\,dx\,d\mu$ is the expected number of neutrons in a distance dx about the point x moving at constant speed with their direction cosines of motion in $d\mu$ about $\mu$, and c is a physical constant that will be taken to satisfy the restriction shown above. Case's method starts by assuming the solution to be of the form $\Phi_\nu(x,\mu) = e^{-x/\nu}\phi(\mu,\nu)$ with a normalization integral constraint of $\int_{-1}^{1}\phi(\mu,\nu)\,d\mu = 1$ to lead to the simple equation

$$(\nu - \mu)\,\phi(\mu,\nu) = \frac{c\nu}{2} \tag{A.54}$$

for the unknown function $\phi(\mu,\nu)$. Case then suggested, see [Case & Zweifel, 1967], the non-simple complete solution of this equation to be

$$\phi(\mu,\nu) = \frac{c\nu}{2}\,P\frac{1}{\nu-\mu} + \lambda(\nu)\,\delta(\nu-\mu), \tag{A.55}$$

where $\lambda(\nu)$ is the usual combination coefficient of the solutions of the homogeneous and non-homogeneous parts of a linear equation, $P(\cdot)$ is a principal value and $\delta(x)$ the Dirac delta, to lead to the full-range $-1 \le \mu \le 1$ solution valid for $-\infty < x < \infty$

$$\Phi(x,\mu) = a(\nu_0)e^{-x/\nu_0}\phi(\mu,\nu_0) + a(-\nu_0)e^{x/\nu_0}\phi(-\nu_0,\mu) + \int_{-1}^{1} a(\nu)e^{-x/\nu}\phi(\mu,\nu)\,d\nu \tag{A.56}$$

of the one-speed neutron transport equation (A.53). Here the real $\nu_0$ and $\nu$ satisfy respectively the integral constraints

$$\frac{c\nu_0}{2}\ln\frac{\nu_0+1}{\nu_0-1} = 1, \qquad |\nu_0| > 1$$

$$\lambda(\nu) = 1 - \frac{c\nu}{2}\ln\frac{1+\nu}{1-\nu}, \qquad \nu \in [-1,1],$$

with

$$\phi(\mu,\nu_0) = \frac{c\nu_0}{2}\,\frac{1}{\nu_0-\mu}$$

following from Eq. (A.55).

It can be shown [Case & Zweifel, 1967] that the eigenfunctions $\phi(\nu,\mu)$ satisfy the full-range orthogonality condition

$$\int_{-1}^{1}\mu\,\phi(\nu,\mu)\,\phi(\nu',\mu)\,d\mu = N(\nu)\,\delta(\nu-\nu'),$$

where the odd normalization constants N are given by

$$N(\pm\nu_0) = \int_{-1}^{1}\mu\,\phi^2(\pm\nu_0,\mu)\,d\mu = \pm\frac{c\nu_0^3}{2}\left[\frac{c}{\nu_0^2-1} - \frac{1}{\nu_0^2}\right] \qquad \text{for } |\nu_0| > 1,$$

and

$$N(\nu) = \nu\left[\lambda^2(\nu) + \left(\frac{\pi c\nu}{2}\right)^2\right] \qquad \text{for } \nu \in [-1,1].$$

With a source of particles $\psi(x_0,\mu)$ located at $x = x_0$ in an infinite medium, Eq. (A.56) reduces to the boundary condition, with $\mu, \nu \in [-1,1]$,

$$\psi(x_0,\mu) = a(\nu_0)e^{-x_0/\nu_0}\phi(\mu,\nu_0) + a(-\nu_0)e^{x_0/\nu_0}\phi(-\nu_0,\mu) + \int_{-1}^{1} a(\nu)e^{-x_0/\nu}\phi(\mu,\nu)\,d\nu \tag{A.57}$$

for the determination of the expansion coefficients $a(\pm\nu_0)$, $\{a(\nu)\}_{\nu\in[-1,1]}$. Use of the above orthogonality integrals then leads to the complete solution of the problem to be

$$a(\nu) = \frac{e^{x_0/\nu}}{N(\nu)}\int_{-1}^{1}\mu\,\psi(x_0,\mu)\,\phi(\mu,\nu)\,d\mu, \qquad \nu = \pm\nu_0 \text{ or } \nu \in [-1,1].$$

For example, in the infinite-medium Green's function problem with $x_0 = 0$ and $\psi(x_0,\mu) = \delta(\mu-\mu_0)/\mu$, the coefficients are $a(\pm\nu_0) = \phi(\mu_0,\pm\nu_0)/N(\pm\nu_0)$ when $\nu = \pm\nu_0$, and $a(\nu) = \phi(\mu_0,\nu)/N(\nu)$ for $\nu \in [-1,1]$.

For a half-space $0 \le x < \infty$, the obvious reduction of Eq. (A.56) to

$$\Phi(x,\mu) = a(\nu_0)e^{-x/\nu_0}\phi(\mu,\nu_0) + \int_{0}^{1} a(\nu)e^{-x/\nu}\phi(\mu,\nu)\,d\nu \tag{A.58}$$

with boundary condition, $\mu, \nu \in [0,1]$,

$$\psi(x_0,\mu) = a(\nu_0)e^{-x_0/\nu_0}\phi(\mu,\nu_0) + \int_{0}^{1} a(\nu)e^{-x_0/\nu}\phi(\mu,\nu)\,d\nu, \tag{A.59}$$

leads to a considerably more difficult determination of the expansion coefficients due to the more involved nature of the orthogonality relations of the eigenfunctions in the half-interval [0, 1], which now read for $\nu, \nu' \in [0,1]$ [Case & Zweifel, 1967]

$$\begin{aligned}
\int_0^1 W(\mu)\,\phi(\mu,\nu')\,\phi(\mu,\nu)\,d\mu &= \frac{W(\nu)N(\nu)}{\nu}\,\delta(\nu-\nu')\\
\int_0^1 W(\mu)\,\phi(\mu,\nu_0)\,\phi(\mu,\nu)\,d\mu &= 0\\
\int_0^1 W(\mu)\,\phi(\mu,-\nu_0)\,\phi(\mu,\nu)\,d\mu &= c\nu\nu_0\,X(-\nu_0)\,\phi(\nu,-\nu_0)\\
\int_0^1 W(\mu)\,\phi(\mu,\pm\nu_0)\,\phi(\mu,\nu_0)\,d\mu &= \mp\left(\frac{c\nu_0}{2}\right)^2 X(\pm\nu_0)\\
\int_0^1 W(\mu)\,\phi(\mu,\nu_0)\,\phi(\mu,-\nu)\,d\mu &= \frac{c^2\nu\nu_0}{4}\,X(-\nu)\\
\int_0^1 W(\mu)\,\phi(\mu,\nu')\,\phi(\mu,-\nu)\,d\mu &= \frac{c\nu'}{2}(\nu_0+\nu)\,\phi(\nu',-\nu)\,X(-\nu)
\end{aligned}\tag{A.60}$$

where the half-range weight function $W(\mu)$ is defined as

$$W(\mu) = \frac{c\mu}{2(1-c)(\nu_0+\mu)X(-\mu)} \tag{A.61}$$

in terms of the X-function

$$X(-\mu) = \exp\left\{-\frac{c}{2}\int_0^1 \frac{\nu}{N(\nu)}\left[1 + \frac{c\nu^2}{1-\nu^2}\right]\ln(\nu+\mu)\,d\nu\right\}, \qquad 0 \le \mu \le 1,$$

that is conveniently obtained from a numerical solution of the nonlinear integral equation

$$\Omega(-\mu) = 1 - \frac{c\mu}{2(1-c)}\int_0^1 \frac{\nu_0^2(1-c)-\nu^2}{(\nu_0^2-\nu^2)(\mu+\nu)\,\Omega(-\nu)}\,d\nu \tag{A.62}$$

to yield

$$X(-\mu) = \frac{\Omega(-\mu)}{\mu + \nu_0\sqrt{1-c}},$$

and $X(\pm\nu_0)$ satisfy

$$X(\nu_0)X(-\nu_0) = \frac{\nu_0^2(1-c)-1}{2(1-c)\,\nu_0^2\,(\nu_0^2-1)}.$$

Two other useful relations involving the W-function are given by $\int_0^1 W(\mu)\phi(\mu,\nu_0)\,d\mu = c\nu_0/2$ and $\int_0^1 W(\mu)\phi(\mu,\nu)\,d\mu = c\nu/2$.

The utility of these full- and half-range orthogonality relations lies in the fact that a suitable class of functions of the type that is involved here can always be expanded in its terms, see [Case & Zweifel, 1967]. An example of this for a full-range problem has been given above; we end this introduction to the generalized (traditionally known as singular in neutron transport theory) eigenfunction method with two examples of half-range orthogonality integrals applied to the half-space problems A and B of Sec. 5.

Problem A (The Milne Problem). In this case there is no incident flux of particles from outside the medium at x = 0, but for large x > 0 the neutron distribution inside the medium behaves like $e^{x/\nu_0}\phi(-\nu_0,\mu)$. Hence the boundary condition

(A.59) at x = 0 reduces to                                       which leads, using the integral relations satisfied by
                                                                 W , to the expansion coefficients
  −φ(µ, −ν0 ) = aA (ν0 )φ(µ, ν0 )
                                                                   aB (ν0 ) = −2/cν0 X(v0 )
                           1
                                                                                1                                   (A.64)
                   +           aA (ν)φ(µ, ν)dν           µ≥0        aB (ν) =        (1 − c)ν(ν0 + ν)X(−ν) .
                       0                                                      N (ν)
Use of the fourth and third equations of Eq. (A.60) and the explicit relation Eq. (A.61) for W(µ) gives respectively the coefficients

\[
a_A(\nu_0) = \frac{X(-\nu_0)}{X(\nu_0)}, \qquad
a_A(\nu) = -\frac{1}{N(\nu)}\, c(1-c)\,\nu_0^{2}\,\nu\, X(-\nu_0)\, X(-\nu). \tag{A.63}
\]

The extrapolated end point z0 of Eq. (67) is related to a_A(ν0) of the Milne problem by a_A(ν0) = −exp(−2z0/ν0).

Problem B (The Constant Source Problem). Here the boundary condition at x = 0 is

\[
1 = a_B(\nu_0)\,\phi(\mu,\nu_0) + \int_0^1 a_B(\nu)\,\phi(\mu,\nu)\,d\nu, \qquad \mu \ge 0
\]

where X(±ν0) are related to Problem A as

\[
X(\nu_0) = \frac{1}{\nu_0}\sqrt{\frac{\nu_0^{2}(1-c)-1}{2\,a_A(\nu_0)\,(1-c)\,(\nu_0^{2}-1)}}, \qquad
X(-\nu_0) = \frac{1}{\nu_0}\sqrt{\frac{a_A(\nu_0)\,\bigl[\nu_0^{2}(1-c)-1\bigr]}{2\,(1-c)\,(\nu_0^{2}-1)}}.
\]

This brief introduction to the singular eigenfunction method should convince the reader of the great difficulties associated with half-space, half-range methods in particle transport theory; note that the X-functions in the coefficients above must be obtained from numerically computed tables. In contrast, full-range methods are more direct due to the simplicity of the weight function µ, which suggests the full-range formulation of half-range problems presented in Sec. 5. Finally, it should be mentioned that this singular eigenfunction method is based on the theory of singular integral equations.
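
As a small numerical illustration of the relation a_A(ν0) = −exp(−2z0/ν0) between the Milne coefficient and the extrapolated end point, the following sketch converts in both directions. The function names and the sample values of ν0 and z0 are assumptions chosen only for illustration; they are not taken from the paper or from any table of X-functions.

```python
import math


def milne_coefficient(z0: float, nu0: float) -> float:
    """Return a_A(nu0) = -exp(-2*z0/nu0) for a given extrapolated end point z0."""
    return -math.exp(-2.0 * z0 / nu0)


def extrapolated_endpoint(a_A: float, nu0: float) -> float:
    """Invert the relation above: z0 = -(nu0/2) * ln(-a_A).

    The relation only makes sense for a negative coefficient, so a
    non-negative a_A is rejected.
    """
    if a_A >= 0.0:
        raise ValueError("a_A(nu0) must be negative for a real z0")
    return -0.5 * nu0 * math.log(-a_A)


if __name__ == "__main__":
    # Hypothetical sample values (assumptions, not data from the paper).
    nu0, z0 = 1.9, 0.71
    a = milne_coefficient(z0, nu0)
    print("a_A(nu0) =", a)
    print("recovered z0 =", extrapolated_endpoint(a, nu0))
```

For positive z0 and ν0 the coefficient lies strictly between −1 and 0, so the logarithm in the inverse is always defined.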

  • 1. Tutorials and Reviews International Journal of Bifurcation and Chaos, Vol. 13, No. 11 (2003) 3147–3233 c World Scientific Publishing Company TOWARD A THEORY OF CHAOS A. SENGUPTA Department of Mechanical Engineering, Indian Institute of Technology Kanpur, Kanpur 208016, India osegu@iitk.ac.in Received February 23, 2001; Revised September 19, 2002 This paper formulates a new approach to the study of chaos in discrete dynamical systems based on the notions of inverse ill-posed problems, set-valued mappings, generalized and multi- valued inverses, graphical convergence of a net of functions in an extended multifunction space [Sengupta & Ray, 2000], and the topological theory of convergence. Order, chaos and complexity are described as distinct components of this unified mathematical structure that can be viewed as an application of the theory of convergence in topological spaces to increasingly nonlinear mappings, with the boundary between order and complexity in the topology of graphical con- vergence being the region in (Multi(X)) that is susceptible to chaos. The paper uses results from the discretized spectral approximation in neutron transport theory [Sengupta, 1988, 1995] and concludes that the numerically exact results obtained by this approximation of the Case singular eigenfunction solution is due to the graphical convergence of the Poisson and conjugate Poisson kernels to the Dirac delta and the principal value multifunctions respectively. In (Multi(X)), the continuous spectrum is shown to reduce to a point spectrum, and we introduce a notion of latent chaotic states to interpret superposition over generalized eigenfunctions. Along with these latent states, spectral theory of nonlinear operators is used to conclude that nature supports complexity to attain efficiently a multiplicity of states that otherwise would remain unavailable to it. Keywords: Chaos; complexity; ill-posed problems; graphical convergence; topology; multifunc- tions. 
Prologue of study of so-called “strongly ” nonlinear system. 1. Generally speaking, the analysis of chaos is ex- . . . Linearity means that the rule that determines what tremely difficult. While a general definition for chaos a piece of a system is going to do next is not influ- applicable to most cases of interest is still lacking, enced by what it is doing now. More precisely this mathematicians agree that for the special case of iter- is intended in a differential or incremental sense: For ation of transformations there are three common char- a linear spring, the increase of its tension is propor- acteristics of chaos: tional to the increment whereby it is stretched, with the ratio of these increments exactly independent of 1. Sensitive dependence on initial conditions, how much it has already been stretched. Such a spring 2. Mixing, 3. Dense periodic points. can be stretched arbitrarily far . . . . Accordingly no real spring is linear. The mathematics of linear objects is [Peitgen, Jurgens & Saupe, 1992] particularly felicitous. As it happens, linear objects en- 2. The study of chaos is a part of a larger program joy an identical, simple geometry. The simplicity of 3147
  • 2. 3148 A. Sengupta this geometry always allows a relatively easy mental 5. One of the most striking aspects of physics image to capture the essence of a problem, with the is the simplicity of its laws. Maxwell’s equations, technicality, growing with the number of parts, basi- Schroedinger’s equations, and Hamilton mechanics cally a detail. The historical prejudice against nonlinear can each be expressed in a few lines. . . . Everything problems is that no so simple nor universal geometry is simple and neat except, of course, the world. Every usually exists. place we look outside the physics classroom we see a Mitchell Feigenbaum’s Foreword (pp. 1–7) world of amazing complexity. . . . So why, if the laws in [Peitgen et al., 1992] are so simple, is the world so complicated? To us com- plexity means that we have structure with variations. 3. The objective of this symposium is to explore the Thus a living organism is complicated because it has impact of the emerging science of chaos on various dis- many different working parts, each formed by varia- ciplines and the broader implications for science and tions in the working out of the same genetic coding. society. The characteristic of chaos is its universality Chaos is also found very frequently. In a chaotic world and ubiquity. At this meeting, for example, we have it is hard to predict which variation will arise in a given scholars representing mathematics, physics, biology, place and time. A complex world is interesting because geophysics and geophysiology, astronomy, medicine, it is highly structured. A chaotic world is interesting psychology, meteorology, engineering, computer sci- because we do not know what is coming next. Our ence, economics and social sciences. 1 Having so many world is both complex and chaotic. 
Nature can pro- disciplines meeting together, of course, involves the duce complex structures even in simple situations and risk that we might not always speak the same lan- obey simple laws even in complex situations. guage, even if all of us have come to talk about [Goldenfeld & Kadanoff, 1999] “chaos”. 6. Where chaos begins, classical science stops. For as Opening address of Heitor Gurgulino de Souza, long as the world has had physicists inquiring into Rector United Nations University, Tokyo the laws of nature, it has suffered a special ignorance [de Souza, 1997] about disorder in the atmosphere, in the turbulent sea, 4. The predominant approach (of how the different in the fluctuations in the wildlife populations, in the fields of science relate to one other ) is reductionist: oscillations of the heart and the brain. But in the 1970s Questions in physical chemistry can be understood a few scientists began to find a way through disor- in terms of atomic physics, cell biology in terms of der. They were mathematicians, physicists, biologists, how biomolecules work . . . . We have the best of rea- chemists . . . (and) the insights that emerged led di- sons for taking this reductionist approach: it works. rectly into the natural world: the shapes of clouds, But shortfalls in reductionism are increasingly appar- the paths of lightning, the microscopic intertwining ent (and) there is something to be gained from sup- of blood vessels, the galactic clustering of stars. . . . plementing the predominantly reductionist approach Chaos breaks across the lines that separate scientific with an integrative agenda. This special section on disciplines, (and) has become a shorthand name for a complex systems is an initial scan (where) we have fast growing movement that is reshaping the fabric of taken a “complex system” to be one whose properties the scientific establishment. are not fully explained by an understanding of its com- [Gleick, 1987] ponent parts. 
Each Viewpoint author 2 was invited to define “complex” as it applied to his or her discipline. 7. order (→) complexity (→) chaos. [Gallagher & Appenzeller, 1999] [Waldrop, 1992] 1 A partial listing of papers is as follows: Chaos and Politics: Application of Nonlinear Dynamics to Socio-Political issues; Chaos in Society: Reflections on the Impact of Chaos Theory on Sociology; Chaos in Neural Networks; The Impact of Chaos on Mathematics; The Impact of Chaos on Physics; The Impact of Chaos on Economic Theory; The Impact of Chaos on Engineering; The impact of Chaos on Biology; Dynamical Disease: And The Impact of Nonlinear Dynamics and Chaos on Cardiology and Medicine. 2 The eight Viewpoint articles are titled: Simple Lessons from Complexity; Complexity in Chemistry; Complexity in Biolog- ical Signaling Systems; Complexity and the Nervous System; Complexity, Pattern, and Evolutionary Trade-Offs in Animal Aggregation; Complexity in Natural Landform Patterns; Complexity and Climate, and Complexity and the Economy.
  • 3. Toward a Theory of Chaos 3149 8. Our conclusions based on these examples seem sim- essary that we have a mathematically clear physi- ple: At present chaos is a philosophical term, not a cal understanding of these notions that are suppos- rigorous mathematical term. It may be a subjective edly reshaping our view of nature. This paper is an notion illustrating the present day limitations of the attempt to contribute to this goal. To make this human intellect or it may describe an intrinsic prop- account essentially self-contained we include here, erty of nature such as the “randomness” of the se- as far as this is practical, the basics of the back- quence of prime numbers. Moreover, chaos may be ground material needed to understand the paper in undecidable in the sense of Godel in that no matter the form of Tutorials and an extended Appendix. what definition is given for chaos, there is some ex- The paradigm of chaos of the kneading of the ample of chaos which cannot be proven to be chaotic dough is considered to provide an intuitive basis from the definition. of the mathematics of chaos [Peitgen et al., 1992], [Brown & Chua, 1996] and one of our fundamental objectives here is to re- count the mathematical framework of this process 9. My personal feeling is that the definition of a “frac- in terms of the theory of ill-posed problems arising tal” should be regarded in the same way as the biolo- from non-injectivity [Sengupta, 1997], maximal ill- gist regards the definition of “life”. There is no hard posedness, and graphical convergence of functions and fast definition, but just a list of properties char- [Sengupta & Ray, 2000]. A natural mathematical acteristic of a living thing . . . . Most living things have formulation of the kneading of the dough in the most of the characteristics on the list, though there form of stretch-cut-and-paste and stretch-cut-and- are living objects that are exceptions to each of them. 
fold operations is in the ill-posed problem arising In the same way, it seems best to regard a fractal as from the increasing non-injectivity of the function a set that has properties such as those listed below, f modeling the kneading operation. rather than to look for a precise definition which will certainly exclude some interesting cases. [Falconer, 1990] Begin Tutorial 1: Functions and 10. Dynamical systems are often said to exhibit chaos Multifunctions without a precise definition of what this means. A relation, or correspondence, between two sets X [Robinson, 1999] and Y , written M: X –→ Y , is basically a rule that → associates subsets of X to subsets of Y ; this is often 1. Introduction expressed as (A, B) ∈ M where A ⊂ X and B ⊂ Y The purpose of this paper is to present an unified, and (A, B) is an ordered pair of sets. The domain self-contained mathematical structure and physical def understanding of the nature of chaos in a discrete D(M) = {A ⊂ X : (∃Z ∈ M)(πX (Z) = A)} dynamical system and to suggest a plausible expla- and range nation of why natural systems tend to be chaotic. def The somewhat extensive quotations with which we R(M) = {B ⊂ Y : (∃Z ∈ M)(πY (Z) = B)} begin above, bear testimony to both the increas- of M are respectively the sets of X which under ingly significant — and perhaps all-pervasive — M corresponds to sets in Y ; here πX and (πY ) role of nonlinearity in the world today as also our are the projections of Z on X and Y , respectively. imperfect state of understanding of its manifesta- Equivalently, (D(M) = {x ∈ X : M(x) = ∅}) and tions. The list of papers at both the UN Confer- (R(M) = x∈D(M) M(x)). The inverse M− of M ence [de Souza, 1997] and in Science [Gallagher & is the relation Appenzeller, 1999] is noteworthy if only to justify M− = {(B, A) : (A, B) ∈ M} the observation of Gleick [1987] that “chaos seems to be everywhere”. Even as everybody appears to so that M− assigns A to B iff M assigns B to A. 
be finding chaos and complexity in all likely and In general, a relation may assign many elements in unlikely places, and possibly because of it, it is nec- its range to a single element from its domain; of
  • 4. 3150 A. Sengupta ¨ ¡ £ ¤¢ ¤ © £ $ # ! ¥ ¦¢ § ¦ ¥   § (a) (a) (a) (b) (b) (b) 3 4 ( )' 6 9 @ 1 0 % 2 ' 7 8 5 (c) (c) (c) (d) (d) (d) Fig. 1. Functional and non-functional relations between two sets X and Y : while f and g are functional relations, M is not. (a) f and g are both injective and surjective (i.e. they are bijective), (b) g is bijective but f is only injective and f −1 ({y2 }) := ∅, (c) f is not 1:1, g is not onto, while (d) M is not a function but is a multifunction. especial significance are functional relations f 3 that linear homogeneous differential equation with con- can assign only a unique element in R(f ) to any stant coefficients of order n 1 has n linearly element in D(f ). Figure 1 illustrates the distinc- independent solutions so that the operator D n of tion between arbitrary and functional relations M D n (y) = 0 has a n-dimensional null space. Inverses and f . This difference between functions (or maps) of non-injective, and in general non-bijective, func- and multifunctions is basic to our development and tions will be denoted by f − . If f is not injective should be fully understood. Functions can again be then classified as injections (or 1:1) and surjections (or def A ⊂ f − f (A) = sat(A) onto). f : X → Y is said to be injective (or one-to- one) if x1 = x2 ⇒ f (x1 ) = f (x2 ) for all x1 , x2 ∈ X, where sat(A) is the saturation of A ⊆ X induced by while it is surjective (or onto) if Y = f (X). f is f ; if f is not surjective then bijective if it is both 1:1 and onto. f f − (B) := B f (X) ⊆ B. Associated with a function f : X → Y is its in- verse f −1 : Y ⊇ R(f ) → X that exists on R(f ) iff If A = sat(A), then A is said to be saturated, and f is injective. Thus when f is bijective, f −1 (y) := B ⊆ R(f ) whenever f f − (B) = B. Thus for non- {x ∈ X: y = f (x)} exists for every y ∈ Y ; infact f is injective f , f − f is not an identity on X just as bijective iff f −1 ({y}) is a singleton for each y ∈ Y . 
f f − is not 1Y if f is not surjective. However the Non-injective functions are not at all rare; if any- set of relations thing, they are very common even for linear maps and it would be perhaps safe to conjecture that f f − f = f, f −f f − = f − (1) they are overwhelmingly predominant in the non- that is always true will be of basic significance in linear world of nature. Thus for example, the simple this work. Following are some equivalent statements 3 We do not distinguish between a relation and its graph although technically they are different objects. Thus although a functional relation, strictly speaking, is the triple (X, f, Y ) written traditionally as f : X → Y , we use it synonymously with the graph f itself. Parenthetically, the word functional in this paper is not necessarily employed for a scalar-valued function, but is used in a wider sense to distinguish between a function and an arbitrary relation (that is a multifunction). Formally, whereas an arbitrary relation from X to Y is a subset of X × Y , a functional relation must satisfy an additional restriction that requires y1 = y2 whenever (x, y1 ) ∈ f and (x, y2 ) ∈ f . In this subset notation, (x, y) ∈ f ⇔ y = f (x).
  • 5. Toward a Theory of Chaos 3151 on the injectivity and surjectivity of functions f : set of X under ∼, denoted by X/ ∼:= {[x]: x ∈ X}, X →Y. has the equivalence classes [x] as its elements; thus (Injec) f is 1:1 ⇔ there is a function f L : Y → X [x] plays a dual role either as subsets of X or as ele- called the left inverse of f , such that f L f = 1X ⇔ ments of X/ ∼. The rule x → [x] defines a surjective A = f − f (A) for all subsets A of X ⇔ f ( Ai ) = function Q: X → X/ ∼ known as the quotient map. f (Ai ). Example 1.1. Let (Surjec) f is onto ⇔ there is a function f R : Y → X called the right inverse of f , such that f f R = 1Y ⇔ S 1 = {(x, y) ∈ R2 ) : x2 + y 2 = 1} B = f f − (B) for all subsets B of Y . be the unit circle in R2 . Consider X = [0, 1] as a As we are primarily concerned with non- subspace of R, define a map injectivity of functions, saturated sets generated by equivalence classes of f will play a significant role q : X → S 1, s → (cos 2πs, sin 2πs), s ∈ X , in our discussions. A relation E on a set X is said from R to R2 , and let ∼ be the equivalence relation to be an equivalence relation if it is 4 on X (ER1) Reflexive: (∀x ∈ X)(xEx). s ∼ t ⇔ (s = t) ∨ (s = 0, t = 1) ∨ (s = 1, t = 0) . (ER2) Symmetric: (∀x, y ∈ X)(xEy ⇒ yEx). (ER3) Transitive: (∀x, y, z ∈ X)(xEy ∧ yEz ⇒ If we bend X around till its ends touch, the resulting xEz). circle represents the quotient set Y = X/ ∼ whose Equivalence relations group together unequal ele- points are equivalent under ∼ as follows ments x1 = x2 of a set as equivalent according to [0] = {0, 1} = [1], [s] = {s} for all s ∈ (0, 1) . the requirements of the relation. This is expressed as x1 ∼ x2 (mod E) and will be represented here by Thus q is bijective for s ∈ (0, 1) but two-to-one for the shorthand notation x1 ∼E x2 , or even simply the special values s = 0 and 1, so that for s, t ∈ X, as x1 ∼ x2 if the specification of E is not essential. s ∼ t ⇔ q(s) = q(t) . 
Thus for a non-injective map if f (x1 ) = f (x2 ) for x1 = x2 , then x1 and x2 can be considered to be This yields a bijection h: X/ ∼ → S 1 such that equivalent to each other since they map onto the same point under f ; thus x1 ∼f x2 ⇔ f (x1 ) = q =h◦Q f (x2 ) defines the equivalence relation ∼ f induced defines the quotient map Q: X → X/ ∼ by h([s]) = by the map f . Given an equivalence relation ∼ on q(s) for all s ∈ [0, 1]. The situation is illustrated by a set X and an element x ∈ X the subset the commutative diagram of Fig. 2 that appears as def [x] = {y ∈ X : y ∼ x} an integral component in a different and more gen- is called the equivalence class of x; thus x ∼ y ⇔ eral context in Sec. 2. It is to be noted that com- [x] = [y]. In particular, equivalence classes gener- mutativity of the diagram implies that if a given ated by f : X → Y , [x]f = {xα ∈ X : f (xα ) = equivalence relation ∼ on X is completely deter- f (x)}, will be a cornerstone of our analysis of chaos mined by q that associates the partitioning equiva- generated by the iterates of non-injective maps, and lence classes in X to unique points in S 1 , then ∼ is the equivalence relation ∼f := {(x, y): f (x) = f (y)} identical to the equivalence relation that is induced generated by f is uniquely defined by the partition by Q on X. Note that a larger size of the equivalence that f induces on X. Of course as x ∼ x, x ∈ [x]. classes can be obtained by considering X = R + for It is a simple matter to see that any two equiva- which s ∼ t ⇔ |s − t| ∈ Z+ . lence classes are either disjoint or equal so that the equivalence classes generated by an equivalence re- End Tutorial 1 lation on X form a disjoint cover of X. The quotient 4 An alternate useful way of expressing these properties for a relation R on X are (ER1) R is reflexive iff 1X ⊆ X (ER2) R is symmetric iff R = R−1 (ER3) R is transitive iff R ◦ R ⊆ R, with R an equivalence relation only if R ◦ R = R.
  • 6. 3152 A. Sengupta ¡ α∈D M(Aα ) and M α∈D Aα ⊆ α∈D M(Aα ) where D is an index set. The following illustrates the difference between the two inverses of M. Let   X be a set that is partitioned into two disjoint M- ¢ invariant subsets X1 and X2 . If x ∈ X1 (or x ∈ X2 ) then M(x) represents that part of X1 (or of X2 ) that is realized immediately after one application §¥¡ ¦ ¤ £ © ¨ of M, while M− (x) denotes the possible precursors of x in X1 (or of X2 ) and M+ (B) is that subset of X whose image lies in B for any subset B ⊂ X. Fig. 2. The quotient map Q. In this paper the multifunctions that we shall be explicitly concerned with arise as the inverses of non-injective maps. One of the central concepts that we consider and The second major component of our theory is employ in this work is the inverse f − of a nonlin- the graphical convergence of a net of functions to ear, non-injective, function f ; here the equivalence a multifunction. In Tutorial 2 below, we replace for classes [x]f = f − f (x) of x ∈ X are the saturated the sake of simplicity and without loss of generality, subsets of X that partition X. While a detailed the net (which is basically a sequence where the in- treatment of this question in the form of the non- dex set is not necessarily the positive integers; thus linear ill-posed problem and its solution is given in every sequence is a net but the family 5 indexed, for Sec. 2 [Sengupta, 1997], it is sufficient to point out example, by Z, the set of all integers, is a net and here from Figs. 1(c) and 1(d), that the inverse of a not a sequence) with a sequence and provide the non-injective function is not a function but a mul- necessary background and motivation for the con- tifunction while the inverse of a multifunction is a cept of graphical convergence. non-injective function. Hence one has the general result that f is a non-injective function ⇔ f − is a multifunction . 
Begin Tutorial 2: Convergence of (2) f is a multifunction Functions ⇔ f − is a non-injective function This Tutorial reviews the inadequacy of the usual notions of convergence of functions either to limit The inverse of a multifunction M: X –→ Y is a gen- → functions or to distributions and suggests the mo- eralization of the corresponding notion for a func- tivation and need for introduction of the notion tion f : X → Y such that of graphical convergence of functions to multifunc- def tions. Here, we follow closely the exposition of M− (y) = {x ∈ X : y ∈ M(x)} Korevaar [1968], and use the notation (f k )∞ to de- k=1 leads to note real or complex valued functions on a bounded or unbounded interval J. M− (B) = {x ∈ X : M(x) B = ∅} A sequence of piecewise continuous functions for any B ⊆ Y , while a more restricted inverse (fk )∞ is said to converge to the function f , nota- k=1 that we shall not be concerned with is given as tion fk → f , on a bounded or unbounded interval M+ (B) = {x ∈ X : M(x) ⊆ B}. Obviously, J6 M+ (B) ⊆ M− (B). A multifunction is injective if (1) Pointwise if x1 = x2 ⇒ M(x1 ) M(x2 ) = ∅, and commonly with functions, it is true that M α∈D Aα = fk (x) → f (x) for all x ∈ J , 5 A function χ: D → X will be called a family in X indexed by D when reference to the domain D is of interest, and a net when it is required to focus attention on its values in X. 6 Observe that it is not being claimed that f belongs to the same class as (fk ). This is the single most important cornerstone on which this paper is based: the need to “complete” spaces that are topologically “incomplete”. The classical high-school example of the related problem of having to enlarge, or extend, spaces that are not big enough is the solution space of algebraic equations with real coefficients like x2 + 1 = 0.
  • 7. Toward a Theory of Chaos 3153 i.e. Given any arbitrary real number ε 0 there It is to be observed that apart from point- exists a K ∈ N that may depend on x, such that wise and uniform convergences, all the other modes |fk (x) − f (x)| ε for all k ≥ K. listed above represent some sort of an averaged con- (2) Uniformly if tribution of the entire interval J and are therefore not of much use when pointwise behavior of the sup |f (x) − fk (x)| → 0 as k → ∞ , limit f is necessary. Thus while limits in the mean x∈J are not unique, oscillating functions are tamed by i.e. Given any arbitrary real number ε 0 there m-integral convergence for adequately large values exists a K ∈ N, such that supx∈J |fk (x) − f (x)| ε of m, and convergence relative to test functions, for all k ≥ K. as we see below, can be essentially reduced to m- (3) In the mean of order p ≥ 1 if |f (x) − f k (x)|p is integral convergence. On the contrary, our graphical integrable over J for each k convergence — which may be considered as a point- wise biconvergence with respect to both the direct |f (x) − fk (x)|p → 0 as k → ∞ . and inverse images of f just as usual pointwise con- J vergence is with respect to its direct image only For p = 1, this is the simple case of convergence in — allows a sequence (in fact, a net) of functions to the mean. converge to an arbitrary relation, unhindered by ex- (4) In the mean m-integrally if it is possible to select ternal influences such as the effects of integrations indefinite integrals and test functions. To see how this can indeed mat- x x1 ter, consider the following (−m) fk (x) = πk (x) + dx1 dx2 Example 1.2. Let fk (x) = sin kx, k = 1, 2, . . . and c c xm−1 let J be any bounded interval of the real line. Then ··· dxm fk (xm ) 1-integrally we have c x (−1) 1 1 and fk (x) = − cos kx = − + sin kx1 dx1 , k k 0 x x1 f (−m) (x) = π(x) + dx1 dx2 which obviously converges to 0 uniformly (and c c therefore in the mean) as k → ∞. 
And herein lies xm−1 the point: even though we cannot conclude about ··· dxm f (xm ) the exact nature of sin kx as k increases indefi- c nitely (except that its oscillations become more and such that for some arbitrary real p ≥ 1, more pronounced), we may very definitely state that (−m) p limk→∞(cos kx)/k = 0 uniformly. Hence from |f (−m) − fk | →0 as k → ∞. J x (−1) fk (x) → 0 = 0 + lim sin kx1 dx1 where the polynomials πk (x) and π(x) are of degree 0 k→∞ m, and c is a constant to be chosen appropriately. it follows that (5) Relative to test functions ϕ if f ϕ and f k ϕ are lim sin kx = 0 (3) integrable over J and k→∞ ∞ 1-integrally. (fk − f )ϕ → 0 for every ϕ ∈ C0 (J) as k → ∞ , Continuing with the same sequence of func- J tions, we now examine its test-functional conver- ∞ where C0 (J) is the class of infinitely differentiable 1 gence with respect to ϕ ∈ C0 (−∞, ∞) that vanishes continuous functions that vanish throughout some for all x ∈ (α, β). Integrating by parts, / neighborhood of each of the end points of J. For ∞ β an unbounded J, a function is said to vanish in fk ϕ = ϕ(x1 ) sin kx1 dx1 some neighborhood of +∞ if it vanishes on some −∞ α ray (r, ∞). 1 While pointwise convergence does not imply = − [ϕ(x1 ) cos kx1 ]β α k any other type of convergence, uniform conver- β gence on a bounded interval implies all the other 1 − ϕ (x1 ) cos kx1 dx1 convergences. k α
[Figure 3, panels (a), (b), (c)]

Fig. 3. Incompleteness of function spaces. (a) demonstrates the classic example of non-completeness of the space of real-valued continuous functions leading to the complete spaces $L_n[a, b]$ whose elements are equivalence classes of functions with $f \sim g$ iff the Lebesgue integral $\int_a^b |f - g|^n = 0$. (b) and (c) illustrate distributional convergence of the functions $f_k(x)$ of Eq. (5) to the Dirac delta $\delta(x)$ leading to the complete space of generalized functions. In comparison, note that the space of continuous functions in the uniform metric $C[a, b]$ is complete which suggests the importance of topologies in determining convergence properties of spaces.

The first integrated term is 0 due to the conditions on $\varphi$ while the second also vanishes because $\varphi \in C_0^1(-\infty, \infty)$. Hence

$$\int_{-\infty}^{\infty} f_k \varphi \to 0 = \int_\alpha^\beta \lim_{k\to\infty} \varphi(x_1) \sin kx_1\, dx_1$$

for all $\varphi$, leading to the conclusion that

$$\lim_{k\to\infty} \sin kx = 0 \tag{4}$$

test-functionally.

This example illustrates the fact that if $\mathrm{Supp}(\varphi) = [\alpha, \beta] \subseteq J$,$^{7}$ integrating by parts a sufficiently large number of times so as to wipe out the pathological behavior of $(f_k)$ gives

$$\int_J f_k \varphi = \int_\alpha^\beta f_k \varphi = -\int_\alpha^\beta f_k^{(-1)} \varphi' = \cdots = (-1)^m \int_\alpha^\beta f_k^{(-m)} \varphi^{(m)}$$

where $f_k^{(-m)}(x) = \pi_k(x) + \int_c^x dx_1 \int_c^{x_1} dx_2 \cdots \int_c^{x_{m-1}} dx_m\, f_k(x_m)$ is an $m$-times arbitrary indefinite integral of $f_k$. If now it is true that $\int_\alpha^\beta f_k^{(-m)} \to \int_\alpha^\beta f^{(-m)}$, then it must also be true that $f_k^{(-m)} \varphi^{(m)}$ converges in the mean to $f^{(-m)} \varphi^{(m)}$ so that

$$\int_\alpha^\beta f_k \varphi = (-1)^m \int_\alpha^\beta f_k^{(-m)} \varphi^{(m)} \to (-1)^m \int_\alpha^\beta f^{(-m)} \varphi^{(m)} = \int_\alpha^\beta f \varphi .$$

$^{7}$ By definition, the support (or supporting interval) of $\varphi(x) \in C_0^\infty[\alpha, \beta]$ is $[\alpha, \beta]$ if $\varphi$ and all its derivatives vanish for $x \leq \alpha$ and $x \geq \beta$.

In fact the converse also holds, leading to the following Equivalences between $m$-convergence in the mean and convergence with respect to test-functions [Korevaar, 1968].

Type 1 Equivalence. If $f$ and $(f_k)$ are functions on $J$ that are integrable on every interior subinterval, then the following are equivalent statements.

(a) For every interior subinterval $I$ of $J$ there is an integer $m_I \geq 0$, and hence a smallest integer $m \geq 0$, such that certain indefinite integrals $f_k^{(-m)}$ of the functions $f_k$ converge in the mean on $I$ to an indefinite integral $f^{(-m)}$; thus $\int_I |f_k^{(-m)} - f^{(-m)}| \to 0$.

(b) $\int_J (f_k - f)\varphi \to 0$ for every $\varphi \in C_0^\infty(J)$.

A significant generalization of this Equivalence is obtained by dropping the restriction that the limit object $f$ be a function. The need for this generalization arises because metric function spaces are
known not to be complete: Consider the sequence of functions [Fig. 3(a)]

$$f_k(x) = \begin{cases} 0, & \text{if } a \leq x \leq 0 \\ kx, & \text{if } 0 \leq x \leq \dfrac{1}{k} \\ 1, & \text{if } \dfrac{1}{k} \leq x \leq b \end{cases} \tag{5}$$

which is not Cauchy in the uniform metric $\rho(f_j, f_k) = \sup_{a \leq x \leq b} |f_j(x) - f_k(x)|$ but is Cauchy in the mean $\rho(f_j, f_k) = \int_a^b |f_j(x) - f_k(x)|\,dx$, or even pointwise. However in either case, $(f_k)$ cannot converge in the respective metrics to a continuous function and the limit is a discontinuous unit step function

$$\Theta(x) = \begin{cases} 0, & \text{if } a \leq x \leq 0 \\ 1, & \text{if } 0 < x \leq b \end{cases}$$

with graph $([a, 0], 0) \cup ((0, b], 1)$, which is also integrable on $[a, b]$. Thus even if the limit of the sequence of continuous functions is not continuous, both the limit and the members of the sequence are integrable functions. This Riemann integration is not sufficiently general, however, and this type of integrability needs to be replaced by a much weaker condition resulting in the larger class of the Lebesgue integrable complete space of functions $L[a, b]$.$^{8}$

$^{8}$ Both Riemann and Lebesgue integrals can be formulated in terms of the so-called step functions $s(x)$, which are piecewise constant functions with values $(\sigma_i)_{i=1}^{I}$ on a finite number of bounded subintervals $(J_i)_{i=1}^{I}$ (which may reduce to a point or may not contain one or both of the end points) of a bounded or unbounded interval $J$, with integral $\int_J s(x)\,dx \overset{\text{def}}{=} \sum_{i=1}^{I} \sigma_i |J_i|$. While the Riemann integral of a bounded function $f(x)$ on a bounded interval $J$ is defined with respect to sequences of step functions $(s_j)_{j=1}^{\infty}$ and $(t_j)_{j=1}^{\infty}$ satisfying $s_j(x) \leq f(x) \leq t_j(x)$ on $J$ with $\int_J (s_j - t_j) \to 0$ as $j \to \infty$ as $R\!\int_J f(x)\,dx = \lim \int_J s_j(x)\,dx = \lim \int_J t_j(x)\,dx$, the less restrictive Lebesgue integral is defined for arbitrary functions $f$ over bounded or unbounded intervals $J$ in terms of Cauchy sequences of step functions $\int_J |s_i - s_k| \to 0$, $i, k \to \infty$, converging to $f(x)$ as $s_j(x) \to f(x)$ pointwise almost everywhere on $J$, to be

$$\int_J f(x)\,dx \overset{\text{def}}{=} \lim_{j\to\infty} \int_J s_j(x)\,dx .$$

That the Lebesgue integral is more general (and therefore is the proper candidate for completion of function spaces) is illustrated by the example of the function defined over $[0, 1]$ to be 0 on the rationals and 1 on the irrationals, for which an application of the definitions verifies that while the Riemann integral is undefined, the Lebesgue integral exists and has value 1. Whenever the Riemann integral of a bounded function over a bounded interval exists, it is equal to its Lebesgue integral. Because it involves a larger family of functions, all integrals in integral convergences are to be understood in the Lebesgue sense.

The functions in Fig. 3(b),

$$\delta_k(x) = \begin{cases} k, & \text{if } 0 < x < \dfrac{1}{k} \\ 0, & x \in [a, b] - \left(0, \dfrac{1}{k}\right), \end{cases}$$

can be associated with the arbitrary indefinite integrals

$$\Theta_k(x) \overset{\text{def}}{=} \delta_k^{(-1)}(x) = \begin{cases} 0, & a \leq x \leq 0 \\ kx, & 0 < x < \dfrac{1}{k} \\ 1, & \dfrac{1}{k} \leq x \leq b \end{cases}$$

of Fig. 3(c), which, as noted above, converge in the mean to the unit step function $\Theta(x)$; hence $\int_{-\infty}^{\infty} \delta_k \varphi \equiv \int_\alpha^\beta \delta_k \varphi = -\int_\alpha^\beta \delta_k^{(-1)} \varphi' \to -\int_0^\beta \varphi'(x)\,dx = \varphi(0)$. But there can be no functional relation $\delta(x)$ for which $\int_\alpha^\beta \delta(x)\varphi(x)\,dx = \varphi(0)$ for all $\varphi \in C_0^1[\alpha, \beta]$, so that unlike in the case of Type 1 Equivalence, the limit in the mean $\Theta(x)$ of the indefinite integrals $\delta_k^{(-1)}(x)$ cannot be expressed as the indefinite integral $\delta^{(-1)}(x)$ of some function $\delta(x)$ on any interval containing the origin. This leads to the second more general type of equivalence.

Type 2 Equivalence. If $(f_k)$ are functions on $J$ that are integrable on every interior subinterval, then the following are equivalent statements.

(a) For every interior subinterval $I$ of $J$ there is an integer $m_I \geq 0$, and hence a smallest integer $m \geq 0$, such that certain indefinite integrals $f_k^{(-m)}$ of the functions $f_k$ converge in the mean on $I$ to an integrable function $\Theta$ which, unlike in Type 1 Equivalence, need not itself be an indefinite integral of some function $f$.
(b) $c_k(\varphi) = \int_J f_k \varphi \to c(\varphi)$ for every $\varphi \in C_0^\infty(J)$.

Since we are now given that $\int_I f_k^{(-m)}(x)\,dx \to \int_I \Psi(x)\,dx$, it must also be true that $f_k^{(-m)} \varphi^{(m)}$ converges in the mean to $\Psi \varphi^{(m)}$ whence

$$\int_J f_k \varphi = (-1)^m \int_I f_k^{(-m)} \varphi^{(m)} \to (-1)^m \int_I \Psi \varphi^{(m)} \neq (-1)^m \int_I f^{(-m)} \varphi^{(m)} .$$

The natural question that arises at this stage is then: What is the nature of the relation (not function any more) $\Psi(x)$? For this it is now stipulated, despite the non-equality in the equation above, that as in the mean $m$-integral convergence of $(f_k)$ to a function $f$,

$$\Theta(x) := \lim_{k\to\infty} \delta_k^{(-1)}(x) \overset{\text{def}}{=} \int_{-\infty}^{x} \delta(x')\,dx' \tag{6}$$

defines the non-functional relation (“generalized function”) $\delta(x)$ integrally as a solution of the integral equation (6) of the first kind; hence formally$^{9}$

$$\delta(x) = \frac{d\Theta}{dx} \tag{7}$$

$^{9}$ The observant reader cannot have failed to notice how mathematical ingenuity successfully transferred the “troubles” of $(\delta_k)_{k=1}^{\infty}$ to the sufficiently differentiable benevolent receptor $\varphi$ so as to be able to work backward, via the resultant trouble free $(\delta_k^{(-m)})_{k=1}^{\infty}$, to the final object $\delta$. This necessarily hides the true character of $\delta$ to allow only a view of its integral manifestation on functions. This unfortunately is not general enough in the strongly nonlinear physical situations responsible for chaos, and is the main reason for constructing the multifunctional extension of function spaces that we use.

End Tutorial 2

The above tells us that the “delta function” is not a function but its indefinite integral is the piecewise continuous function $\Theta$ obtained as the mean (or pointwise) limit of a sequence of non-differentiable functions with the integral of $d\Theta_k(x)/dx$ being preserved for all $k \in \mathbb{Z}_+$. What then is the delta (and not its integral)? The answer to this question is contained in our multifunctional extension $\mathrm{Multi}(X, Y)$ of the function space $\mathrm{Map}(X, Y)$ considered in Sec. 3. Our treatment of ill-posed problems is used to obtain an understanding and interpretation of the numerical results of the discretized spectral approximation in neutron transport theory [Sengupta, 1988, 1995]. The main conclusions are the following: In a one-dimensional discrete system that is governed by the iterates of a nonlinear map, the dynamics is chaotic if and only if the system evolves to a state of maximal ill-posedness. The analysis is based on the non-injectivity, and hence ill-posedness, of the map; this may be viewed as a mathematical formulation of the stretch-and-fold and stretch-cut-and-paste kneading operations of the dough that are well-established artifacts in the theory of chaos, and the concept of maximal ill-posedness helps in obtaining a physical understanding of the nature of chaos. We do this through the fundamental concept of the graphical convergence of a sequence (generally a net) of functions [Sengupta & Ray, 2000] that is allowed to converge graphically, when the conditions are right, to a set-valued map or multifunction. Since ill-posed problems naturally lead to multifunctional inverses through functional generalized inverses [Sengupta, 1997], it is natural to seek solutions of ill-posed problems in multifunctional space $\mathrm{Multi}(X, Y)$ rather than in spaces of functions $\mathrm{Map}(X, Y)$; here $\mathrm{Multi}(X, Y)$ is an extension of $\mathrm{Map}(X, Y)$ that is generally larger than the smallest dense extension $\mathrm{Multi}_{|}(X, Y)$.

Feedback and iteration are natural processes by which nature evolves itself. Thus almost every process of evolution is a self-correction process by which the system proceeds from the present to the future through a controlled mechanism of input and evaluation of the past. Evolution laws are inherently nonlinear and complex; here complexity is to be understood as the natural manifestation of the nonlinear laws that govern the evolution of the system.

This paper presents a mathematical description of complexity based on [Sengupta, 1997] and [Sengupta & Ray, 2000] and is organized as follows. In Sec. 1, we follow [Sengupta, 1997] to give an overview of ill-posed problems and their solution that forms the foundation of our approach. Sections 2 to 4 apply these ideas by defining a chaotic dynamical system as a maximally ill-posed problem; by doing this we are able to overcome the limitations of the three Devaney characterizations of chaos [Devaney, 1989] that apply to the specific case of iteration of transformations in a metric space, and the resulting graphical convergence of functions to multifunctions is the basic tool of our approach. Section 5 analyzes graphical convergence in $\mathrm{Multi}(X)$ for the discretized spectral approximation
of neutron transport theory, which suggests a natural link between ill-posed problems and spectral theory of nonlinear operators. This seems to offer an answer to the question of why a natural system should increase its complexity, and eventually tend toward chaoticity, by becoming increasingly nonlinear.

2. Ill-Posed Problem and Its Solution

This section, based on [Sengupta, 1997], presents a formulation and solution of ill-posed problems arising out of the non-injectivity of a function $f: X \to Y$ between topological spaces $X$ and $Y$. A workable knowledge of this approach is necessary as our theory of chaos, leading to the characterization of chaotic systems as being a maximally ill-posed state of a dynamical system, is a direct application of these ideas and can be taken to constitute a mathematical representation of the familiar stretch-cut-and-paste and stretch-and-fold paradigms of chaos. The problem of finding an $x \in X$ for a given $y \in Y$ from the functional relation $f(x) = y$ is an inverse problem that is ill-posed (or, the equation $f(x) = y$ is ill-posed) if any one or more of the following conditions are satisfied.

(IP1) $f$ is not injective. This non-uniqueness problem of the solution for a given $y$ is the single most significant criterion of ill-posedness used in this work.

(IP2) $f$ is not surjective. For a $y \in Y$, this is the existence problem of the given equation.

(IP3) When $f$ is bijective, the inverse $f^{-1}$ is not continuous, which means that small changes in $y$ may lead to large changes in $x$.

A problem $f(x) = y$ for which a solution exists, is unique, and for which small changes in data $y$ lead to only small changes in the solution $x$ is said to be well-posed or properly posed. This means that $f(x) = y$ is well-posed if $f$ is bijective and the inverse $f^{-1}: Y \to X$ is continuous; otherwise the equation is ill-posed or improperly posed. It is to be noted that the three criteria are not, in general, independent of each other. Thus if $f$ represents a bijective, bounded linear operator between Banach spaces $X$ and $Y$, then the inverse mapping theorem guarantees that the inverse $f^{-1}$ is continuous. Hence ill-posedness depends not only on the algebraic structures of $X$, $Y$, $f$ but also on the topologies of $X$ and $Y$.

Example 2.1. As a non-trivial example of an inverse problem, consider the heat equation

$$\frac{\partial \theta(x, t)}{\partial t} = c^2 \frac{\partial^2 \theta(x, t)}{\partial x^2}$$

for the temperature distribution $\theta(x, t)$ of a one-dimensional homogeneous rod of length $L$ satisfying the initial condition $\theta(x, 0) = \theta_0(x)$, $0 \leq x \leq L$, and boundary conditions $\theta(0, t) = 0 = \theta(L, t)$, $0 \leq t \leq T$, having the Fourier sine-series solution

$$\theta(x, t) = \sum_{n=1}^{\infty} A_n \sin\left(\frac{n\pi}{L} x\right) e^{-\lambda_n^2 t} \tag{8}$$

where $\lambda_n = (c\pi/L)n$ and

$$A_n = \frac{2}{L} \int_0^L \theta_0(x') \sin\left(\frac{n\pi}{L} x'\right) dx'$$

are the Fourier expansion coefficients. While the direct problem evaluates $\theta(x, t)$ from the differential equation and initial temperature distribution $\theta_0(x)$, the inverse problem calculates $\theta_0(x)$ from the integral equation

$$\theta_T(x) = \frac{2}{L} \int_0^L k(x, x')\, \theta_0(x')\, dx', \quad 0 \leq x \leq L,$$

when this final temperature $\theta_T$ is known, and

$$k(x, x') = \sum_{n=1}^{\infty} \sin\left(\frac{n\pi}{L} x\right) \sin\left(\frac{n\pi}{L} x'\right) e^{-\lambda_n^2 T}$$

is the kernel of the integral equation. In terms of the final temperature the distribution becomes

$$\theta(x, t) = \sum_{n=1}^{\infty} B_n \sin\left(\frac{n\pi}{L} x\right) e^{-\lambda_n^2 (t - T)} \tag{9}$$

with Fourier coefficients

$$B_n = \frac{2}{L} \int_0^L \theta_T(x') \sin\left(\frac{n\pi}{L} x'\right) dx' .$$

In $L^2[0, L]$, Eqs. (8) and (9) at $t = T$ and $t = 0$ yield respectively

$$\|\theta_T(x)\|^2 = \frac{L}{2} \sum_{n=1}^{\infty} A_n^2 e^{-2\lambda_n^2 T} \leq e^{-2\lambda_1^2 T} \|\theta_0\|^2 \tag{10}$$

$$\|\theta_0\|^2 = \frac{L}{2} \sum_{n=1}^{\infty} B_n^2 e^{2\lambda_n^2 T} . \tag{11}$$

The last two equations differ from each other in the significant respect that whereas Eq. (10) shows
that the direct problem is well-posed according to (IP3), Eq. (11) means that in the absence of similar bounds the inverse problem is ill-posed.$^{10}$

$^{10}$ Recall that for a linear operator continuity and boundedness are equivalent concepts.

Example 2.2. Consider the Volterra integral equation of the first kind

$$y(x) = \int_a^x r(x')\,dx' = Kr$$

where $y, r \in C[a, b]$ and $K: C[0, 1] \to C[0, 1]$ is the corresponding integral operator. Since the differential operator $D = d/dx$ under the sup-norm $\|r\| = \sup_{0 \leq x \leq 1} |r(x)|$ is unbounded, the inverse problem $r = Dy$ for a differentiable function $y$ on $[a, b]$ is ill-posed, see Example 6.1. However, $y = Kr$ becomes well-posed if $y$ is considered to be in $C^1[0, 1]$ with norm $\|y\| = \sup_{0 \leq x \leq 1} |Dy|$. This illustrates the importance of the topologies of $X$ and $Y$ in determining the ill-posed nature of the problem when this is due to (IP3).

Ill-posed problems in nonlinear mathematics of type (IP1) arising from the non-injectivity of $f$ can be considered to be a generalization of non-uniqueness of solutions of linear equations as, for example, in eigenvalue problems or in the solution of a system of linear algebraic equations with a larger number of unknowns than the number of equations. In both cases, for a given $y \in Y$, the solution set of the equation $f(x) = y$ is given by

$$f^{-}(y) = [x]_f = \{x' \in X : f(x') = f(x) = y\} .$$

A significant point of difference between linear and nonlinear problems is that unlike the special importance of 0 in linear mathematics, there are no preferred elements in nonlinear problems; this leads to a shift of emphasis from the null space of linear problems to equivalence classes for nonlinear equations. To motivate the role of equivalence classes, let us consider the null spaces in the following linear problems.

(a) Let $f: \mathbb{R}^2 \to \mathbb{R}$ be defined by $f(x, y) = x + y$, $(x, y) \in \mathbb{R}^2$. The null space of $f$ is generated by the equation $y = -x$ on the $x$–$y$ plane, and the graph of $f$ is the plane passing through the lines $\rho = x$ and $\rho = y$. For each $\rho \in \mathbb{R}$ the equivalence classes $f^{-}(\rho) = \{(x, y) \in \mathbb{R}^2 : x + y = \rho\}$ are lines on the graph parallel to the null set.

(b) For a linear operator $A: \mathbb{R}^n \to \mathbb{R}^m$, $m < n$, satisfying (IP1) and (IP2), the problem $Ax = y$ reduces $A$ to echelon form with rank $r$ less than $\min\{m, n\}$, when the given equations are consistent. The solution, however, produces a generalized inverse leading to a set-valued inverse $A^{-}$ of $A$ for which the inverse images of $y \in R(A)$ are multivalued because of the non-trivial null space of $A$ introduced by assumption (IP1). Specifically, a null space of dimension $n - r$ is generated by the free variables $\{x_j\}_{j=r+1}^{n}$ which are arbitrary: this is ill-posedness of type (IP1). In addition, $m - r$ rows of the row reduced echelon form of $A$ have all 0 entries that introduce restrictions on $m - r$ coordinates $\{y_i\}_{i=r+1}^{m}$ of $y$ which are now related to $\{y_i\}_{i=1}^{r}$: this illustrates ill-posedness of type (IP2). Inverse ill-posed problems therefore generate multivalued solutions through a generalized inverse of the mapping.

(c) The eigenvalue problem

$$\left(\frac{d^2}{dx^2} + \lambda^2\right) y = 0, \qquad y(0) = 0 = y(1)$$

has the following equivalence class of 0

$$[0]_{D^2} = \{\sin(\pi m x)\}_{m=0}^{\infty}, \qquad D^2 = \frac{d^2}{dx^2} + \lambda^2,$$

as its eigenfunctions corresponding to the eigenvalues $\lambda_m = \pi m$.

Ill-posed problems are primarily of interest to us explicitly as non-injective maps $f$, that is under the condition of (IP1). The two other conditions (IP2) and (IP3) are not as significant and play only an implicit role in the theory. In its application to iterative systems, the degree of non-injectivity of $f$, defined as the number of its injective branches, increases with iteration of the map. A necessary (but not sufficient) condition for chaos to occur is the increasing non-injectivity of $f$ that is expressed descriptively in the chaos literature as stretch-and-fold or stretch-cut-and-paste operations. This increasing non-injectivity, which we discuss in the following sections, is what causes a dynamical system to tend toward chaoticity. Ill-posedness arising from non-surjectivity of (injective) $f$ in the form of regularization [Tikhonov & Arsenin, 1977] has received wide attention in the literature of ill-posed problems; this however is not of much significance in our work.
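The statement that the number of injective branches grows with iteration can be checked numerically. The sketch below is an illustration only: the tent and logistic maps on $[0, 1]$, the grid resolution, and all function names are our assumptions and are not taken from the text. It counts the monotone pieces of the $n$-th iterate by counting sign changes of the finite-difference slope on a uniform grid.

```python
import numpy as np


def tent(x):
    """Tent map on [0, 1] (an assumed example of a non-injective map)."""
    return 1.0 - np.abs(2.0 * x - 1.0)


def logistic(x):
    """Logistic map 4x(1 - x) on [0, 1] (also an assumed example)."""
    return 4.0 * x * (1.0 - x)


def iterate(f, x, n):
    """Return the n-th iterate of f evaluated at x."""
    for _ in range(n):
        x = f(x)
    return x


def count_injective_branches(f, n, num_cells=2**14):
    """Count the monotone (injective) pieces of the n-th iterate on [0, 1].

    The count is 1 plus the number of sign changes of the finite-difference
    slope on a uniform grid; num_cells is an arbitrary resolution that must
    be fine enough to resolve every turning point.
    """
    x = np.linspace(0.0, 1.0, num_cells + 1)
    y = iterate(f, x, n)
    slopes = np.sign(np.diff(y))
    slopes = slopes[slopes != 0]  # ignore exactly flat cells
    return 1 + int(np.count_nonzero(slopes[1:] != slopes[:-1]))


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, count_injective_branches(tent, n),
              count_injective_branches(logistic, n))
```

For both maps the count doubles with each iteration, which is the sense in which the degree of non-injectivity increases; the grid must be refined if larger $n$ is of interest.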
[Figure 4, panels (a) and (b)]

Fig. 4. (a) Moore–Penrose generalized inverse. The decomposition of $X$ and $Y$ into the four fundamental subspaces of $A$ comprising the null space $N(A)$, the column (or range) space $R(A)$, the row space $R(A^T)$ and $N(A^T)$, the complement of $R(A)$ in $Y$, is a basic result in the theory of linear equations. The Moore–Penrose inverse takes advantage of the geometric orthogonality of the row space $R(A^T)$ and $N(A)$ in $\mathbb{R}^n$ and that of the column space and $N(A^T)$ in $\mathbb{R}^m$. (b) When $X$ and $Y$ are not inner-product spaces, a non-injective inverse can be defined by extending $f$ to $Y - R(f)$ suitably as shown by the dashed curve, where $g(x) := r_1 + ((r_2 - r_1)/r_1) f(x)$ for all $x \in D(f)$ was taken to be a good definition of an extension that replicates $f$ in $Y - R(f)$; here $x_1 \sim x_2$ under both $f$ and $g$, and $y_1 \sim y_2$ under $\{f, g\}$ just as b is equivalent to b in the Moore–Penrose case. Note that $\{f, g\}$ and $\{f^{-}, g^{-}\}$ are both multifunctions on $X$ and $Y$, respectively. Our inverse $G$, introduced later in this section, is however injective with $G(Y - R(f)) := 0$.

Begin Tutorial 3: Generalized Inverse

In this Tutorial, we take a quick look at the equation $a(x) = y$, where $a: X \to Y$ is a linear map that need not be either one-one or onto. Specifically, we will take $X$ and $Y$ to be the Euclidean spaces $\mathbb{R}^n$ and $\mathbb{R}^m$ so that $a$ has a matrix representation $A \in \mathbb{R}^{m \times n}$ where $\mathbb{R}^{m \times n}$ is the collection of $m \times n$ matrices with real entries. The inverse $A^{-1}$ exists and is unique iff $m = n$ and $\mathrm{rank}(A) = n$; this is the situation depicted in Fig. 1(a). If $A$ is neither one-one nor onto, then we need to consider the multifunction $A^{-}$, a functional choice of which is known as the generalized inverse $G$ of $A$. A good introductory text for generalized inverses is [Campbell & Mayer, 1979]. Figure 4(a) introduces the following definition of the Moore–Penrose generalized inverse $G_{MP}$.

Definition 2.1 (Moore–Penrose Inverse). If $a: \mathbb{R}^n \to \mathbb{R}^m$ is a linear transformation with matrix representation $A \in \mathbb{R}^{m \times n}$ then the Moore–Penrose inverse $G_{MP} \in \mathbb{R}^{n \times m}$ of $A$ (we will use the same notation $G_{MP}: \mathbb{R}^m \to \mathbb{R}^n$ for the inverse of the map $a$) is the noninjective map defined in terms of the row and column spaces of $A$, $\mathrm{row}(A) = R(A^T)$, $\mathrm{col}(A) = R(A)$, as

$$G_{MP}(y) \overset{\text{def}}{=} \begin{cases} (a|_{\mathrm{row}(A)})^{-1}(y), & \text{if } y \in \mathrm{col}(A) \\ 0, & \text{if } y \in N(A^T) . \end{cases} \tag{12}$$

Note that the restriction $a|_{\mathrm{row}(A)}$ of $a$ to $R(A^T)$ is bijective so that the inverse $(a|_{\mathrm{row}(A)})^{-1}$ is well-defined. The role of the transpose matrix appears naturally, and the $G_{MP}$ of Eq. (12) is the unique matrix that satisfies the conditions

$$A G_{MP} A = A, \quad G_{MP} A G_{MP} = G_{MP}, \quad (G_{MP} A)^T = G_{MP} A, \quad (A G_{MP})^T = A G_{MP} \tag{13}$$

that follow immediately from the definition (12); hence $G_{MP} A$ and $A G_{MP}$ are orthogonal projections$^{11}$ onto the subspaces $R(A^T) = R(G_{MP})$ and $R(A)$, respectively. Recall that the range space $R(A^T)$ of $A^T$ is the same as the row space $\mathrm{row}(A)$ of $A$, and $R(A)$ is also known as the column space of $A$, $\mathrm{col}(A)$.

$^{11}$ A real matrix $A$ is an orthogonal projector iff $A^2 = A$ and $A = A^T$.
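The four conditions (13) and the projector property of footnote 11 are easy to check numerically. The sketch below is illustrative only: it builds $G_{MP}$ from a singular value decomposition (a standard construction that we assume here in place of the restriction-based definition (12)), applies it to the $4 \times 5$, rank-2 matrix that Example 2.3 uses, and treats the function name `moore_penrose` and the cutoff `tol` as hypothetical choices.

```python
import numpy as np

# The 4 x 5 matrix of Example 2.3 (rank 2), representing a map R^5 -> R^4.
A = np.array([[1, -3, 2, 1, 2],
              [3, -9, 10, 2, 9],
              [2, -6, 4, 2, 4],
              [2, -6, 8, 1, 7]], dtype=float)


def moore_penrose(A, tol=1e-10):
    """Moore-Penrose inverse via the SVD A = U diag(s) V^T.

    Singular values below tol * max(s) are treated as zero, so only the
    row-space and column-space directions are inverted (tol is an assumed,
    problem-dependent cutoff).
    """
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    keep = s > tol * s.max()
    return (Vt[keep].T / s[keep]) @ U[:, keep].T


G = moore_penrose(A)
P_row = G @ A   # candidate orthogonal projector onto row(A) in R^5
P_col = A @ G   # candidate orthogonal projector onto col(A) in R^4

penrose_conditions = {
    "A G A = A": np.allclose(A @ G @ A, A),
    "G A G = G": np.allclose(G @ A @ G, G),
    "(G A)^T = G A": np.allclose(P_row.T, P_row),
    "(A G)^T = A G": np.allclose(P_col.T, P_col),
}

if __name__ == "__main__":
    for name, ok in penrose_conditions.items():
        print(name, ok)
    # The trace of an orthogonal projector equals the rank of A.
    print("rank from trace:", round(float(np.trace(P_row))))
```

The explicit cutoff matters for a rank-deficient matrix such as this one: without it, rounding-level singular values would be inverted and would dominate the result.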
Example 2.3. For $a: \mathbb{R}^5 \to \mathbb{R}^4$, let

$$A = \begin{pmatrix} 1 & -3 & 2 & 1 & 2 \\ 3 & -9 & 10 & 2 & 9 \\ 2 & -6 & 4 & 2 & 4 \\ 2 & -6 & 8 & 1 & 7 \end{pmatrix} .$$

By reducing the augmented matrix $(A|y)$ to the row-reduced echelon form, it can be verified that the null and range spaces of $A$ are three- and two-dimensional, respectively. Bases for the row space of $A$, the null space of $A^T$, and the column space of $A$ obtained from the echelon form are respectively

$$\left\{ \begin{pmatrix} 1 \\ -3 \\ 0 \\ \tfrac{3}{2} \\ \tfrac{1}{2} \end{pmatrix}, \begin{pmatrix} 0 \\ 0 \\ 1 \\ -\tfrac{1}{4} \\ \tfrac{3}{4} \end{pmatrix} \right\}; \qquad \left\{ \begin{pmatrix} -2 \\ 0 \\ 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 \\ -1 \\ 0 \\ 1 \end{pmatrix} \right\}; \qquad \left\{ \begin{pmatrix} 1 \\ 3 \\ 2 \\ 2 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \\ 0 \\ 1 \end{pmatrix} \right\} .$$

According to its definition Eq. (12), the Moore–Penrose inverse maps the middle two of the above set to $(0, 0, 0, 0, 0)^T$, and the $A$-image of the first two (which are respectively $(19, 70, 38, 51)^T$ and $(70, 275, 140, 205)^T$ lying, as they must, in the span of the last two), to the span of $(1, -3, 2, 1, 2)^T$ and $(3, -9, 10, 2, 9)^T$ because $a$ restricted to this subspace of $\mathbb{R}^5$ is bijective. Hence

$$G_{MP} \left( A \begin{pmatrix} 1 \\ -3 \\ 0 \\ \tfrac{3}{2} \\ \tfrac{1}{2} \end{pmatrix} \;\; A \begin{pmatrix} 0 \\ 0 \\ 1 \\ -\tfrac{1}{4} \\ \tfrac{3}{4} \end{pmatrix} \;\; \begin{matrix} -2 \\ 0 \\ 1 \\ 0 \end{matrix} \;\; \begin{matrix} 1 \\ -1 \\ 0 \\ 1 \end{matrix} \right) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ -3 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ \tfrac{3}{2} & -\tfrac{1}{4} & 0 & 0 \\ \tfrac{1}{2} & \tfrac{3}{4} & 0 & 0 \end{pmatrix} .$$

The second matrix on the left is invertible as its rank is 4. This gives

$$G_{MP} = \begin{pmatrix} \tfrac{9}{275} & -\tfrac{1}{275} & \tfrac{18}{275} & -\tfrac{2}{55} \\ -\tfrac{27}{275} & \tfrac{3}{275} & -\tfrac{54}{275} & \tfrac{6}{55} \\ -\tfrac{10}{143} & \tfrac{6}{143} & -\tfrac{20}{143} & \tfrac{16}{143} \\ \tfrac{238}{3575} & -\tfrac{57}{3575} & \tfrac{476}{3575} & -\tfrac{59}{715} \\ -\tfrac{129}{3575} & \tfrac{106}{3575} & -\tfrac{258}{3575} & \tfrac{47}{715} \end{pmatrix} \tag{14}$$

as the Moore–Penrose inverse of $A$ that readily verifies all the four conditions of Eqs. (13). The basic point here is that, as in the case of a bijective map, $G_{MP} A$ and $A G_{MP}$ are identities on the row and column spaces of $A$ that define its rank. For later use, when we return to this example for a simpler inverse $G$, given below are the orthonormal bases of the four fundamental subspaces with respect to which $G_{MP}$ is a representation of the generalized inverse of $A$; these calculations were done by MATLAB. The basis for

(a) the column space of $A$ consists of the first two columns of the eigenvectors of $AA^T$:

$$\left( -\tfrac{1633}{2585}, -\tfrac{363}{892}, \tfrac{3317}{6387}, \tfrac{363}{892} \right)^T$$

$$\left( -\tfrac{929}{1435}, \tfrac{709}{1319}, \tfrac{346}{6299}, -\tfrac{709}{1319} \right)^T$$

(b) the null space of $A^T$ consists of the last two columns of the eigenvectors of $AA^T$:

$$\left( -\tfrac{3185}{8306}, -\tfrac{293}{2493}, \tfrac{3185}{4153}, \tfrac{1777}{3547} \right)^T$$

$$\left( \tfrac{323}{1732}, \tfrac{533}{731}, \tfrac{323}{866}, \tfrac{1037}{1911} \right)^T$$

(c) the row space of $A$ consists of the first two columns of the eigenvectors of $A^T A$:

$$\left( \tfrac{421}{13823}, \tfrac{44}{14895}, -\tfrac{569}{918}, -\tfrac{659}{2526}, \tfrac{1036}{1401} \right)$$

$$\left( \tfrac{661}{690}, \tfrac{412}{1775}, \tfrac{59}{2960}, -\tfrac{1523}{10221}, -\tfrac{303}{3974} \right)$$

(d) the null space of $A$ consists of the last three columns of the eigenvectors of $A^T A$:
$$\left( -\tfrac{571}{15469}, -\tfrac{369}{776}, \tfrac{149}{25344}, -\tfrac{291}{350}, -\tfrac{389}{1365} \right)$$

$$\left( -\tfrac{281}{1313}, \tfrac{956}{1489}, \tfrac{875}{1706}, -\tfrac{1279}{2847}, \tfrac{409}{1473} \right)$$

$$\left( \tfrac{292}{1579}, -\tfrac{876}{1579}, \tfrac{203}{342}, \tfrac{621}{4814}, \tfrac{1157}{2152} \right)$$

The matrices $Q_1$ and $Q_2$ with these eigenvectors $(x_i)$ satisfying $\|x_i\| = 1$ and $(x_i, x_j) = 0$ for $i \neq j$ as their columns are orthogonal matrices with the simple inverse criterion $Q^{-1} = Q^T$.

End Tutorial 3

The basic issue in the solution of the inverse ill-posed problem is its reduction to a well-posed one when restricted to suitable subspaces of the domain and range of $A$. Considerations of geometry leading to their decomposition into orthogonal subspaces is only an additional feature that is not central to the problem: recall from Eq. (1) that any function $f$ must necessarily satisfy the more general set-theoretic relations $f f^{-} f = f$ and $f^{-} f f^{-} = f^{-}$ of Eq. (13) for the multiinverse $f^{-}$ of $f: X \to Y$. The second distinguishing feature of the MP-inverse is that it is defined, by a suitable extension, on all of $Y$ and not just on $f(X)$ which is perhaps more natural. The availability of orthogonality in inner-product spaces allows this extension to be made in an almost normal fashion. As we shall see below, the additional geometric restriction of Eq. (13) is not essential to the solution process, and in fact only results in a less canonical form of the inverse.

Begin Tutorial 4: Topological Spaces

This Tutorial is meant to familiarize the reader with the basic principles of a topological space. A topological space $(X, \mathcal{U})$ is a set $X$ with a class$^{12}$ $\mathcal{U}$ of distinguished subsets, called open sets of $X$, that satisfy

(T1) The empty set $\emptyset$ and the whole $X$ belong to $\mathcal{U}$.

(T2) Finite intersections of members of $\mathcal{U}$ belong to $\mathcal{U}$.

(T3) Arbitrary unions of members of $\mathcal{U}$ belong to $\mathcal{U}$.

$^{12}$ In this sense, a class is a set of sets.

Example 2.4

(1) The smallest topology possible on a set $X$ is its indiscrete topology when the only open sets are $\emptyset$ and $X$; the largest is the discrete topology where every subset of $X$ is open (and hence also closed).

(2) In a metric space $(X, d)$, let $B_\varepsilon(x, d) = \{y \in X : d(x, y) < \varepsilon\}$ be an open ball at $x$. Any subset $U$ of $X$ such that for each $x \in U$ there is a $d$-ball $B_\varepsilon(x, d) \subseteq U$ in $U$, is said to be an open set of $(X, d)$. The collection of all these sets is the topology induced by $d$. The topological space $(X, \mathcal{U})$ is then said to be associated with (induced by) $(X, d)$.

(3) If $\sim$ is an equivalence relation on a set $X$, the set of all saturated sets $[x]_\sim = \{y \in X : y \sim x\}$ is a topology on $X$; this topology is called the topology of saturated sets. We argue in Sec. 4.2 that this constitutes the defining topology of a chaotic system.

(4) For any subset $A$ of the set $X$, the $A$-inclusion topology on $X$ consists of $\emptyset$ and every superset of $A$, while the $A$-exclusion topology on $X$ consists of $X$ and all subsets of $X - A$. Thus $A$ is open in the inclusion topology and closed in the exclusion, and in general every open set of one is closed in the other. The special cases of the $a$-inclusion and $a$-exclusion topologies for $A = \{a\}$ are defined in a similar fashion.

(5) The cofinite and cocountable topologies, in which the open sets of an infinite (resp. uncountable) set $X$ are respectively the complements of finite and countable subsets, are examples of topologies with some unusual properties that are covered in Appendix A.1. If $X$ is itself finite (respectively, countable), then its cofinite (respectively, cocountable) topology is the discrete topology consisting of all its subsets. It is therefore useful to adopt the convention, unless stated to the contrary, that cofinite and cocountable spaces are respectively infinite and uncountable.

In the space $(X, \mathcal{U})$, a neighborhood of a point $x \in X$ is a nonempty subset $N$ of $X$ that contains an open set $U$ containing $x$; thus $N \subseteq X$ is a
neighborhood of $x$ iff

$$x \in U \subseteq N \tag{15}$$

for some $U \in \mathcal{U}$. The largest open set that can be used here is $\mathrm{Int}(N)$ (where, by definition, $\mathrm{Int}(A)$ is the largest open set that is contained in $A$) so that the above neighborhood criterion for a subset $N$ of $X$ can be expressed in the equivalent form

$$N \subseteq X \text{ is a } \mathcal{U}\text{-neighborhood of } x \text{ iff } x \in \mathrm{Int}_{\mathcal{U}}(N) \tag{16}$$

implying that a subset of $(X, \mathcal{U})$ is a neighborhood of all its interior points, so that $N \in \mathcal{N}_x \Rightarrow N \in \mathcal{N}_y$ for all $y \in \mathrm{Int}(N)$. The collection of all neighborhoods of $x$

$$\mathcal{N}_x \overset{\text{def}}{=} \{N \subseteq X : x \in U \subseteq N \text{ for some } U \in \mathcal{U}\} \tag{17}$$

is the neighborhood system at $x$, and the subcollection $U$ of the topology used in this equation constitutes a neighborhood (local) base or basic neighborhood system, at $x$, see Definition A.1.1 of Appendix A.1. The properties

(N1) $x$ belongs to every member $N$ of $\mathcal{N}_x$,

(N2) The intersection of any two neighborhoods of $x$ is another neighborhood of $x$: $N, M \in \mathcal{N}_x \Rightarrow N \cap M \in \mathcal{N}_x$,

(N3) Every superset of any neighborhood of $x$ is a neighborhood of $x$: $(M \in \mathcal{N}_x) \wedge (M \subseteq N) \Rightarrow N \in \mathcal{N}_x$,

that characterize $\mathcal{N}_x$ completely are a direct consequence of the definitions (15), (16) that may also be stated as

(N0) Any neighborhood $N \in \mathcal{N}_x$ contains another neighborhood $U$ of $x$ that is a neighborhood of each of its points: $((\forall N \in \mathcal{N}_x)(\exists U \in \mathcal{N}_x)(U \subseteq N)) : (\forall y \in U \Rightarrow U \in \mathcal{N}_y)$.

Property (N0) in fact serves as the defining characteristic of an open set, and $U$ can be identified with the largest open set $\mathrm{Int}(N)$ contained in $N$; hence a set $G$ in a topological space is open iff it is a neighborhood of each of its points. Accordingly if $\mathcal{N}_x$ is a given class of subsets of $X$ associated with each $x \in X$ satisfying (N1)–(N3), then (N0) defines the special class of neighborhoods $G$

$$\mathcal{U} = \{G \in \mathcal{N}_x : x \in B \subseteq G \text{ for all } x \in G \text{ and some basic nbd } B \in \mathcal{N}_x\} \tag{18}$$

as the unique topology on $X$ that contains a basic neighborhood of each of its points, for which the neighborhood system at $x$ coincides exactly with the assigned collection $\mathcal{N}_x$; compare with Definition A.1.1. Neighborhoods in topological spaces are a generalization of the familiar notion of distances of metric spaces that quantifies “closeness” of points of $X$.

A neighborhood of a non-empty subset $A$ of $X$ that will be needed later on is defined in a similar manner: $N$ is a neighborhood of $A$ iff $A \subseteq \mathrm{Int}(N)$, that is $A \subseteq U \subseteq N$; thus the neighborhood system at $A$, given by $\mathcal{N}_A = \bigcap_{a \in A} \mathcal{N}_a := \{G \subseteq X : G \in \mathcal{N}_a \text{ for every } a \in A\}$, is the class of common neighborhoods of each point of $A$.

Some examples of neighborhood systems at a point $x$ in $X$ are the following:

(1) In an indiscrete space $(X, \mathcal{U})$, $X$ is the only neighborhood of every point of the space; in a discrete space any set containing $x$ is a neighborhood of the point.

(2) In an infinite cofinite (or uncountable cocountable) space, every neighborhood of a point is an open neighborhood of that point.

(3) In the topology of saturated sets under the equivalence relation $\sim$, the neighborhood system at $x$ consists of all supersets of the equivalence class $[x]_\sim$.

(4) Let $x \in X$. In the $x$-inclusion topology, $\mathcal{N}_x$ consists of all the non-empty open sets of $X$ which are the supersets of $\{x\}$. For a point $y \neq x$ of $X$, $\mathcal{N}_y$ are the supersets of $\{x, y\}$.

For any given class ${}_{T}\mathcal{S}$ of subsets of $X$, a unique topology $\mathcal{U}({}_{T}\mathcal{S})$ can always be constructed on $X$ by taking all finite intersections ${}_{T}\mathcal{S}_{\wedge}$ of members of ${}_{T}\mathcal{S}$ followed by arbitrary unions ${}_{T}\mathcal{S}_{\wedge\vee}$ of these finite intersections. $\mathcal{U}({}_{T}\mathcal{S}) := {}_{T}\mathcal{S}_{\wedge\vee}$ is the smallest topology on $X$ that contains ${}_{T}\mathcal{S}$ and is said to be generated by ${}_{T}\mathcal{S}$. For a given topology $\mathcal{U}$ on $X$ satisfying $\mathcal{U} = \mathcal{U}({}_{T}\mathcal{S})$, ${}_{T}\mathcal{S}$ is a subbasis, and ${}_{T}\mathcal{S}_{\wedge} := {}_{T}\mathcal{B}$ a basis, for the topology $\mathcal{U}$; for more on topological basis, see Appendix A.1. The topology generated by a subbase essentially builds not from the collection ${}_{T}\mathcal{S}$ itself but from the finite intersections ${}_{T}\mathcal{S}_{\wedge}$ of its subsets; in comparison the base generates a topology directly from a collection ${}_{T}\mathcal{S}$ of subsets by forming their unions. Thus whereas any class of subsets can be used as a subbasis, a given collection must meet certain qualifications to pass the test of a base for a topology: these and related topics are covered in Appendix A.1. Different subbases, therefore, can be used to generate different topologies on the same set $X$ as the following examples for the case of
X = R demonstrates; here (a, b), [a, b), (a, b] and [a, b], for a ≤ b ∈ R, are the usual open-closed intervals in R.¹³ The subbases TS_1 = {(a, ∞), (−∞, b)}, TS_2 = {[a, ∞), (−∞, b)}, TS_3 = {(a, ∞), (−∞, b]} and TS_4 = {[a, ∞), (−∞, b]} give the respective bases TB_1 = {(a, b)}, TB_2 = {[a, b)}, TB_3 = {(a, b]} and TB_4 = {[a, b]}, a ≤ b ∈ R, leading to the standard (usual), lower limit (Sorgenfrey), upper limit, and discrete (take a = b) topologies on R. Bases of the type (a, ∞) and (−∞, b) provide the right and left ray topologies on R.

¹³ By definition, an interval I in a totally ordered set X is a subset of X with the property (x_1, x_2 ∈ I) ∧ (x_3 ∈ X : x_1 ≺ x_3 ≺ x_2) ⇒ x_3 ∈ I so that any element of X lying between two elements of I also belongs to I.

This feasibility of generating different topologies on a set can be of great practical significance because open sets determine convergence characteristics of nets and continuity characteristics of functions, thereby making it possible for nature to play around with the structure of its working space in its kitchen to its best possible advantage.¹⁴

¹⁴ Although we do not pursue this point of view here, it is nonetheless tempting to speculate that the answer to the question "Why does the entropy of an isolated system increase?" may be found by exploiting this line of reasoning that seeks to explain the increase in terms of a visible component associated with the usual topology as against a different latent workplace topology that governs the dynamics of nature.

Here are a few essential concepts and terminology for topological spaces.

Definition 2.2 (Boundary, Closure, Interior). The boundary of A in X is the set of points x ∈ X such that every neighborhood N of x intersects both A and X − A:

\[ \mathrm{Bdy}(A) \overset{\mathrm{def}}{=} \{x \in X : (\forall N \in \mathcal{N}_x)((N \cap A \neq \emptyset) \wedge (N \cap (X - A) \neq \emptyset))\} \tag{19} \]

where N_x is the neighborhood system of Eq. (17) at x.

The closure of A is the set of all points x ∈ X such that each neighborhood of x contains at least one point of A that may be x itself. Thus the set

\[ \mathrm{Cl}(A) \overset{\mathrm{def}}{=} \{x \in X : (\forall N \in \mathcal{N}_x)(N \cap A \neq \emptyset)\} \tag{20} \]

of all points in X adherent to A is given by the union of A with its boundary.

The interior of A

\[ \mathrm{Int}(A) \overset{\mathrm{def}}{=} \{x \in X : (\exists N \in \mathcal{N}_x)(N \subseteq A)\} \tag{21} \]

consisting of those points of X that are in A but not in its boundary, Int(A) = A − Bdy(A), is the largest open subset of X that is contained in A. Hence it follows that Int(Bdy(A)) = ∅, the boundary of A is the intersection of the closures of A and X − A, and a subset N of X is a neighborhood of x iff x ∈ Int(N).

The three subsets Int(A), Bdy(A) and exterior of A defined as Ext(A) := Int(X − A) = X − Cl(A), are pairwise disjoint and have the full space X as their union.

Definition 2.3 (Derived and Isolated sets). Let A be a subset of X. A point x ∈ X (which may or may not be a point of A) is a cluster point of A if every neighborhood N ∈ N_x contains at least one point of A different from x. The derived set of A

\[ \mathrm{Der}(A) \overset{\mathrm{def}}{=} \{x \in X : (\forall N \in \mathcal{N}_x)(N \cap (A - \{x\}) \neq \emptyset)\} \tag{22} \]

is the set of all cluster points of A. The complement of Der(A) in A

\[ \mathrm{Iso}(A) \overset{\mathrm{def}}{=} A - \mathrm{Der}(A) = \mathrm{Cl}(A) - \mathrm{Der}(A) \tag{23} \]

are the isolated points of A to which no proper sequence in A converges, that is there exists a neighborhood of any such point that contains no other point of A so that the only sequence that converges to a ∈ Iso(A) is the constant sequence (a, a, a, . . .).

Clearly,

\[ \mathrm{Cl}(A) = A \cup \mathrm{Der}(A) = A \cup \mathrm{Bdy}(A) = \mathrm{Iso}(A) \cup \mathrm{Der}(A) = \mathrm{Int}(A) \cup \mathrm{Bdy}(A) \]

with the last two being disjoint unions, and A is closed iff A contains all its cluster points, Der(A) ⊆ A, iff A contains its closure. Hence

\[ A = \mathrm{Cl}(A) \Leftrightarrow \mathrm{Cl}(A) = \{x \in A : ((\exists N \in \mathcal{N}_x)(N \subseteq A)) \vee ((\forall N \in \mathcal{N}_x)(N \cap (X - A) \neq \emptyset))\}. \]
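The definitions in Eqs. (19)–(23) can be checked mechanically on a finite topological space. The following is a minimal sketch, not part of the original development: the four-point space `X`, the list `opens` and the subset `A` are illustrative assumptions, and neighborhoods are represented by the open sets containing a point, which suffices because every neighborhood contains an open one.

```python
# Illustrative finite topological space (an assumption, not taken from the paper).
X = frozenset({1, 2, 3, 4})
opens = [frozenset(), frozenset({1}), frozenset({2, 3}),
         frozenset({1, 2, 3}), X]


def boundary(A, X, opens):
    """Eq. (19): every open nbd of x meets both A and X - A."""
    comp = X - A
    return frozenset(x for x in X
                     if all((U & A) and (U & comp) for U in opens if x in U))


def closure(A, X, opens):
    """Eq. (20): every open nbd of x meets A."""
    return frozenset(x for x in X if all(U & A for U in opens if x in U))


def interior(A, X, opens):
    """Eq. (21): some open nbd of x is contained in A."""
    return frozenset(x for x in X if any(x in U and U <= A for U in opens))


def derived(A, X, opens):
    """Eq. (22): every open nbd of x meets A in a point other than x."""
    return frozenset(x for x in X
                     if all(U & (A - {x}) for U in opens if x in U))


A = frozenset({1})
bdy = boundary(A, X, opens)
cl = closure(A, X, opens)
int_a = interior(A, X, opens)
der = derived(A, X, opens)
iso = A - der                      # Eq. (23)
ext = interior(X - A, X, opens)    # exterior of A

# The identities stated in the text, evaluated for this particular example.
closure_identities = (cl == A | der == A | bdy == iso | der == int_a | bdy)
partition_of_X = (int_a | bdy | ext == X
                  and not (int_a & bdy) and not (int_a & ext) and not (bdy & ext))
```

For this particular choice both `closure_identities` and `partition_of_X` evaluate to true, and `ext` coincides with `X - cl`; other finite topologies can be substituted by editing `opens`, provided the list is closed under unions and intersections.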
Comparison of Eqs. (19) and (22) also makes it clear that Bdy(A) ⊆ Der(A). The special case of A = Iso(A) with Der(A) ⊆ X − A is important enough to deserve a special mention:

Definition 2.4 (Donor set). A proper, nonempty subset A of X such that Iso(A) = A with Der(A) ⊆ X − A will be called self-isolated or donor. Thus sequences eventually in a donor set converge only in its complement; this is the opposite of the characteristic of a closed set where all converging sequences eventually in the set must necessarily converge in it. A closed-donor set with a closed neighbor has no derived or boundary sets, and will be said to be isolated in X.

Example 2.5. In an isolated set sequences converge, if they have to, simultaneously in the complement (because it is donor) and in it (because it is closed). Convergent sequences in such a set can only be constant sequences. Physically, if we consider adherents to be contributions made by the dynamics of the corresponding sequences, then an isolated set is secluded from its neighbor in the sense that it neither receives any contributions from its surroundings, nor does it give away any. In this light and terminology, a closed set is a selfish set (recall that a set A is closed in X iff every convergent net of X that is eventually in A converges in A; conversely a set is open in X iff the only nets that converge in A are eventually in it), whereas a set with a derived set that intersects itself and its complement may be considered to be neutral. Appendix A.3 shows the various possibilities for the derived set and boundary of a subset A of X.

Some useful properties of these concepts for a subset A of a topological space X are the following.

(a) Bdy_X(X) = ∅,
(b) Bdy(A) = Cl(A) ∩ Cl(X − A),
(c) Int(A) = X − Cl(X − A) = A − Bdy(A) = Cl(A) − Bdy(A),
(d) Int(A) ∩ Bdy(A) = ∅,
(e) X = Int(A) ∪ Bdy(A) ∪ Int(X − A),
(f) \[ \mathrm{Int}(A) = \bigcup \{G \subseteq X : G \text{ is an open set of } X \text{ contained in } A\} \tag{24} \]
(g) \[ \mathrm{Cl}(A) = \bigcap \{F \subseteq X : F \text{ is a closed set of } X \text{ containing } A\} \tag{25} \]

A straightforward consequence of property (b) is that the boundary of any subset A of a topological space X is closed in X; this significant result may also be demonstrated as follows. If x ∈ X is not in the boundary of A there is some neighborhood N of x that does not intersect both A and X − A. For each point y ∈ N, N is a neighborhood of that point that does not meet A and X − A simultaneously so that N is contained wholly in X − Bdy(A). We may now take N to be open without any loss of generality implying thereby that X − Bdy(A) is an open set of X from which it follows that Bdy(A) is closed in X.

Further material on topological spaces relevant to our work can be found in Appendix A.3.

End Tutorial 4

Working in a general topological space, we now recall the solution of an ill-posed problem f(x) = y [Sengupta, 1997] that leads to a multifunctional inverse f⁻ through the generalized inverse G. Let f : (X, U) → (Y, V) be a (nonlinear) function between two topological spaces (X, U) and (Y, V) that is neither one-one nor onto. Since f is not one-one, X can be partitioned into disjoint equivalence classes with respect to the equivalence relation x_1 ∼ x_2 ⇔ f(x_1) = f(x_2). Picking a representative member from each of the classes (this is possible by the Axiom of Choice; see the following Tutorial) produces a basic set X_B of X; it is basic as it corresponds to the row space in the linear matrix example which is all that is needed for taking an inverse. X_B is the counterpart of the quotient set X/∼ of Sec. 1, with the important difference that whereas the points of the quotient set are the equivalence classes of X, X_B is a subset of X with each of the classes contributing a point to X_B. It then follows that f_B : X_B → f(X) is the bijective restriction a|_row(A) that reduces the original ill-posed problem to a well-posed one with X_B and f(X) corresponding respectively to the row and column spaces of A, and f_B⁻¹ : f(X) → X_B is the basic inverse from which the multiinverse f⁻ is obtained through G, which in turn corresponds to the Moore–Penrose inverse G_MP. The topological considerations (obviously not for inner product spaces
that applies to the Moore–Penrose inverse) needed to complete the solution are discussed below and in Appendix A.1.

Begin Tutorial 5: Axiom of Choice and Zorn's Lemma

Since some of our basic arguments depend on it, this Tutorial contains a short description of the Axiom of Choice that has been described as "one of the most important, and at the same time one of the most controversial, principles of mathematics". What this axiom states is this: For any set X there exists a function f_C : P_0(X) → X such that f_C(A_α) ∈ A_α for every non-empty subset A_α of X; here P_0(X) is the class of all subsets of X except ∅. Thus, if X = {x_1, x_2, x_3} is a three element set, a possible choice function is given by

\[ f_C(\{x_1, x_2, x_3\}) = x_3, \quad f_C(\{x_1, x_2\}) = x_1, \quad f_C(\{x_2, x_3\}) = x_3, \quad f_C(\{x_3, x_1\}) = x_3, \]
\[ f_C(\{x_1\}) = x_1, \quad f_C(\{x_2\}) = x_2, \quad f_C(\{x_3\}) = x_3. \]

It must be appreciated that the axiom is only an existence result that asserts every set to have a choice function, even when nobody knows how to construct one in a specific case. Thus, for example, how does one pick out the isolated irrationals √2 or π from the uncountable reals? There is no doubt that they do exist, for we can construct a right-angled triangle with sides of length 1 or a circle of radius 1. The axiom tells us that these choices are possible even though we do not know how exactly to do it; all that can be stated with confidence is that we can actually pick up rationals arbitrarily close to these irrationals.

The axiom of choice is essentially meaningful when X is infinite as illustrated in the last two examples. This is so because even when X is denumerable, it would be physically impossible to make an infinite number of selections either all at a time or sequentially: the Axiom of Choice nevertheless tells us that this is possible. The real strength and utility of the Axiom however is when X and some or all of its subsets are uncountable as in the case of the choice of the single element π from the reals. To see this more closely in the context of maps that we are concerned with, let f : X → Y be a non-injective, onto map. To construct a functional right inverse f_r : Y → X of f, we must choose, for each y ∈ Y, one representative element x_rep from the set f⁻(y) and define f_r(y) to be that element according to f ∘ f_r(y) = f(x_rep) = y. If there is no preferred or natural way to make this choice, the axiom of choice allows us to make an arbitrary selection from the infinitely many that may be possible from f⁻(y). When a natural choice is indeed available, as for example in the case of the initial value problem y′(x) = x; y(0) = α_0 on [0, a], the definite solution α_0 + x²/2 may be selected from the infinitely many \( \int_0^x x\,dx = \alpha + x^2/2 \), 0 ≤ x ≤ a, that are permissible, and the axiom of choice sanctions this selection. In addition, each y ∈ Y gives rise to the family of solution sets A_y = {f⁻(y) : y ∈ Y} and the real power of the axiom is its assertion that it is possible to make a choice f_C(A_y) ∈ A_y on every A_y simultaneously; this permits the choice on every A_y of the collection to be made at the same time.

Pause Tutorial 5

¹⁵ In a subspace A of X, a subset U_A of A is open iff U_A = A ∩ U for some open set U of X. The notion of subspace topology can be formalized with the help of the inclusion map i : A → (X, U) that puts every point of A back to where it came from, thus U_A = {U_A = A ∩ U : U ∈ U} = {i⁻(U) : U ∈ U}.

Figure 5 shows our formulation and solution [Sengupta, 1997] of the inverse ill-posed problem f(x) = y. In sub-diagram X − X_B − f(X), the surjection p : X → X_B is the counterpart of the quotient map Q of Fig. 2 that is known in the present context as the identification of X with X_B (as it identifies each saturated subset of X with its representative point in X_B), with the space (X_B, FT{U; p}) carrying the identification topology FT{U; p} being known as an identification space. By sub-diagram Y − X_B − f(X), the image f(X) of f gets the subspace topology¹⁵ IT{j; V} from (Y, V) by the inclusion j : f(X) → Y when its open sets are generated as, and only as, j⁻¹(V) = V ∩ f(X) for V ∈ V. Furthermore if the bijection f_B connecting X_B and f(X) (which therefore acts as a 1 : 1 correspondence between their points, implying that these sets are set-theoretically identical
except for their names) is image continuous, then by Theorem A.2.1 of Appendix 2, so is the association q = f_B ∘ p : X → f(X) that associates saturated sets of X with elements of f(X); this makes f(X) look like an identification space of X by assigning to it the topology FT{U; q}. On the other hand if f_B happens to be preimage continuous, then X_B acquires, by Theorem A.2.2, the initial topology IT{e; V} by the embedding e : X_B → Y that embeds X_B into Y through j ∘ f_B, making it look like a subspace of Y.¹⁶ In this dual situation, f_B has the highly interesting topological property of being simultaneously image and preimage continuous when the open sets of X_B and f(X) (which are simply the f_B⁻¹-images of the open sets of f(X) which, in turn, are the f_B-images of these saturated open sets) can be considered to have been generated by f_B, and are respectively the smallest and largest collection of subsets of X and Y that makes f_B ini(tial-fi)nal continuous [Sengupta, 1997]. A bijective ininal function such as f_B is known as a homeomorphism and ininality for functions that are neither 1 : 1 nor onto is a generalization of homeomorphism for bijections; refer to Eqs. (A.47) and (A.48) for a set-theoretic formulation of this distinction. A homeomorphism f : (X, U) → (Y, V) renders the homeomorphic spaces (X, U) and (Y, V) topologically indistinguishable which may be considered to be identical in as far as their topological properties are concerned.

¹⁶ A surjective function is an association iff it is image continuous and an injective function is an embedding iff it is preimage continuous.

[Figure 5: commutative diagram connecting X, X_B, f(X) and Y.]

Fig. 5. Solution of ill-posed problem f(x) = y, f : X → Y. G : Y → X_B, a generalized inverse of f because of fGf = f and GfG = G which follows from the commutativity of the diagrams, is a functional selection of the multi-inverse f⁻ : (Y, V) ⇉ (X, U). The injective and surjective restrictions of f will be topologically denoted by their generic notations e and q, respectively.

Remark. It may be of some interest here to speculate on the significance of ininality in our work. Physically, a map f : (X, U) → (Y, V) between two spaces can be taken to represent an interaction between them and the algebraic and topological characters of f determine the nature of this interaction. A simple bijection merely sets up a correspondence, that is an interaction, between every member of X with some member of Y, whereas a continuous map establishes the correspondence among the special category of "open" sets. Open sets, as we see in Appendix A.1, are the basic ingredients in the theory of convergence of sequences, nets and filters, and the characterization of open sets in terms of convergence, namely that a set G in X is open in it if every net or sequence that converges in X to a point in G is eventually in G, see Appendix A.1, may be interpreted to mean that such sets represent groupings of elements that require membership of the group before permitting an element to belong to it; an open set unlike its complement the closed or selfish set, however, does not forbid a net that has been eventually in it to settle down in its selfish neighbor, who nonetheless will never allow such a situation to develop in its own territory. An ininal map forces these well-defined and definite groups in (X, U) and (Y, V) to interact with each other through f; this is not possible with simple continuity as there may be open sets in X that are not derived from those of Y and non-open sets in Y whose inverse images are open in X. It is our hypothesis that the driving force behind the evolution of a system represented by the input–output relation f(x) = y is the attainment of the ininal triple state (X, f, Y) for the system. A preliminary analysis of this hypothesis is to be found in Sec. 4.2.

For ininality of the interaction, it is therefore necessary to have

\[ \left.\begin{aligned} \mathrm{FT}\{\mathcal{U}; f\} &= \mathrm{IT}\{j; \mathcal{V}\} \\ \mathrm{IT}\{f; \mathcal{V}\} &= \mathrm{FT}\{\mathcal{U}; p\} \end{aligned}\right\}; \tag{26} \]

here f stands for its surjective restriction in the first equation and for its injective restriction in the second, and in what follows we will refer to the injective and surjective restrictions of f by their generic topological symbols of embedding e and association q, respectively. What are the topological characteristics of f
in order that the requirements of Eq. (26) be met? From Appendix A.1, it should be clear by superposing the two parts of Fig. 21 over each other that given q : (X, U) → (f(X), FT{U; q}) in the first of these equations, IT{j; V} will equal FT{U; q} iff j is an ininal open inclusion and Y receives FT{U; f}. In a similar manner, preimage continuity of e requires p to be open ininal and f to be preimage continuous if the second of Eq. (26) is to be satisfied. Thus under the restrictions imposed by Eq. (26), the interaction f between X and Y must be such as to give X the smallest possible topology of f-saturated sets and Y the largest possible topology of images of all these sets: f, under these conditions, is an ininal transformation. Observe that a direct application of parts (b) of Theorems A.2.1 and A.2.2 to Fig. 5 implies that Eq. (26) is satisfied iff f_B is ininal, that is iff it is a homeomorphism. Ininality of f is simply a reflection of this as it is neither 1 : 1 nor onto.

The f- and p-images of each saturated set of X are singletons in Y (these saturated sets in X arose, in the first place, as f⁻({y}) for y ∈ Y) and in X_B, respectively. This permits the embedding e = j ∘ f_B to give X_B the character of a virtual subspace of Y just as i makes f(X) a real subspace. Hence the inverse images p⁻(x_r) = f⁻(e(x_r)) with x_r ∈ X_B, and q⁻(y) = f⁻(i(y)) with y = f_B(x_r) ∈ f(X) are the same, and are just the corresponding f⁻ images via the injections e and i, respectively. G, a left inverse of e, is a generalized inverse of f. G is a generalized inverse because the two set-theoretic defining requirements of fGf = f and GfG = G for the generalized inverse are satisfied, as Fig. 5 shows, in the following forms

\[ j f_B G f = f, \qquad G j f_B G = G. \]

In fact the commutativity embodied in these equalities is self evident from the fact that e = i f_B is a left inverse of G, that is eG = 1_Y. On putting back X_B into X by identifying each point of X_B with the set it came from yields the required set-valued inverse f⁻, and G may be viewed as a functional selection of the multiinverse f⁻.

An injective branch of a function f in this work refers to the restrictions f_B and its associated inverse f_B⁻¹.

The following example of an inverse ill-posed problem will be useful in fixing the notations introduced above. Let f on [0, 1] be the function shown below.

[Figure 6: graph of f on [0, 1].]

Fig. 6. The function
\[ f(x) = \begin{cases} 2x, & 0 \le x < 3/8 \\ 3/4, & 3/8 \le x \le 5/8 \\ 7/6 - 2x/3, & 5/8 < x \le 1. \end{cases} \]

¹⁷ If y ∉ R(f) then f⁻({y}) := ∅ which is true for any subset of Y − R(f). However from the set-theoretic definition of natural numbers that requires 0 := ∅, 1 = {0}, 2 = {0, 1} to be defined recursively, it follows that f⁻(y) can be identified with 0 whenever y is not in the domain of f⁻. Formally, the successor set A⁺ = A ∪ {A} of A can be used to write 0 := ∅, 1 = 0⁺ = 0 ∪ {0}, 2 = 1⁺ = 1 ∪ {1} = {0} ∪ {1}, 3 = 2⁺ = 2 ∪ {2} = {0} ∪ {1} ∪ {2}, etc. Then the set of natural numbers N is defined to be the intersection of all the successor sets, where a successor set S is any set that contains ∅ and A⁺ whenever A belongs to S. Observe how in the successor notation, countable union of singleton integers recursively define the corresponding sum of integers.

Then f(x) = y is well-posed for [0, 1/4), and ill-posed in [1/4, 1]. There are two injective branches of f in {[1/4, 3/8) ∪ (5/8, 1]}, and f is constant ill-posed in [3/8, 5/8]. Hence the basic component f_B of f can be taken to be f_B(x) = 2x for x ∈ [0, 3/8) having the inverse f_B⁻¹(y) = y/2 with y ∈ [0, 3/4]. The generalized inverse is obtained by taking [0, 3/4] as a subspace of [0, 1], while the multiinverse f⁻ follows by associating with every point of the basic domain [0, 1]_B = [0, 3/8], the respective equivalent points [3/8]_f = [3/8, 5/8] and [x]_f = {x, 7/4 − 3x} for x ∈ [1/4, 3/8). Thus the inverses G and f⁻ of f are¹⁷

\[ G(y) = \begin{cases} \dfrac{y}{2}, & y \in \left[0, \dfrac{3}{4}\right] \\[1ex] 0, & y \in \left(\dfrac{3}{4}, 1\right], \end{cases} \]
\[ f^-(y) = \begin{cases} \dfrac{y}{2}, & y \in \left[0, \dfrac{1}{2}\right) \\[1ex] \left\{\dfrac{y}{2}, \dfrac{7}{4} - \dfrac{3y}{2}\right\}, & y \in \left[\dfrac{1}{2}, \dfrac{3}{4}\right) \\[1ex] \left[\dfrac{3}{8}, \dfrac{5}{8}\right], & y = \dfrac{3}{4} \\[1ex] 0, & y \in \left(\dfrac{3}{4}, 1\right], \end{cases} \]

which shows that f⁻ is multivalued. In order to avoid cumbersome notations, an injective branch of f will always refer to a representative basic branch f_B, and its "inverse" will mean either f_B⁻¹ or G.

Example 2.3 (Revisited). The row reduced echelon form of the augmented matrix (A|b) of Example 2.3 is

\[ (A|b) \to \left(\begin{array}{rrrrr|c} 1 & -3 & 0 & \frac{3}{2} & \frac{1}{2} & \frac{5b_1}{2} - \frac{b_2}{2} \\ 0 & 0 & 1 & -\frac{1}{4} & \frac{3}{4} & -\frac{3b_1}{4} + \frac{b_2}{4} \\ 0 & 0 & 0 & 0 & 0 & -2b_1 + b_3 \\ 0 & 0 & 0 & 0 & 0 & b_1 - b_2 + b_4 \end{array}\right) \tag{27} \]

The multifunctional solution x = A⁻b, with b any element of Y = R⁴ not necessarily in the image of a, is

\[ x = A^- b = Gb + x_2 \begin{pmatrix} 3 \\ 1 \\ 0 \\ 0 \\ 0 \end{pmatrix} + x_4 \begin{pmatrix} -\frac{3}{2} \\ 0 \\ \frac{1}{4} \\ 1 \\ 0 \end{pmatrix} + x_5 \begin{pmatrix} -\frac{1}{2} \\ 0 \\ -\frac{3}{4} \\ 0 \\ 1 \end{pmatrix}, \]

with its multifunctional character arising from the arbitrariness of the coefficients x_2, x_4 and x_5. The generalized inverse

\[ G = \begin{pmatrix} \frac{5}{2} & -\frac{1}{2} & 0 & 0 \\ 0 & 0 & 0 & 0 \\ -\frac{3}{4} & \frac{1}{4} & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} : Y \to X_B \tag{28} \]

is the unique matrix representation of the functional inverse a_B⁻¹ : a(R⁵) → X_B extended to Y defined according to¹⁸

\[ g(b) = \begin{cases} a_B^{-1}(b), & \text{if } b \in R(a) \\ 0, & \text{if } b \in Y - R(a), \end{cases} \tag{29} \]

that bears comparison with the basic inverse

\[ A_B^{-1}(b^*) = \begin{pmatrix} \frac{5}{2} & -\frac{1}{2} & 0 & 0 \\ 0 & 0 & 0 & 0 \\ -\frac{3}{4} & \frac{1}{4} & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} b_1 \\ b_2 \\ 2b_1 \\ b_2 - b_1 \end{pmatrix} : a(R^5) \to X_B \]

between the two-dimensional column and row spaces of A which is responsible for the particular solution of Ax = b.

¹⁸ See footnote 17 for a justification of the definition when b is not in R(a).

Thus G is simply A_B⁻¹ acting on its domain a(X) considered a subspace of Y, suitably extended to the whole of Y. That it is indeed a generalized inverse is readily seen through the matrix multiplications GAG and AGA that can be verified to reproduce G and A, respectively. Comparison of Eqs. (12) and (29) shows that the Moore–Penrose inverse differs from ours through the geometrical constraints imposed in its definition, Eqs. (13). Of course, this results in a more complex inverse (14) as compared to our very simple (28); nevertheless it is true that both the inverses satisfy

\[ E((E(G_{MP}))^T) = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix} = E((E(G))^T) \]

where E(A) is the row-reduced echelon form of A. The canonical simplicity of Eq. (28) as compared to Eq. (14) is a general feature that suggests a more natural choice of bases by the map a than the orthogonal set imposed by Moore and Penrose. This is to be expected since the MP inverse, governed by Eq. (13), is a subset of our less restricted inverse
described by only the first two of (13); more specifically the difference is made clear in Fig. 4(a) which shows that for any b ∉ R(A), only G_MP(b_⊥) = 0 as compared to G(b) = 0. This seems to imply that introducing extraneous topological considerations into the purely set-theoretic inversion process may not be a recommended way of inverting, and the simple bases comprising the row and null spaces of A and Aᵀ (that are mutually orthogonal just as those of the Moore–Penrose) are a better choice for the particular problem Ax = b than the general orthonormal bases that the MP inverse introduces. These "good" bases, with respect to which the generalized inverse G has a considerably simpler representation, are obtained in a straightforward manner from the row-reduced forms of A and Aᵀ. These bases are

(a) The column space of A is spanned by the columns (1, 3, 2, 2)ᵀ and (1, 5, 2, 4)ᵀ of A that correspond to the basic columns containing the leading 1's in the row-reduced form of A,
(b) The null space of Aᵀ is spanned by the solutions (−2, 0, 1, 0)ᵀ and (1, −1, 0, 1)ᵀ of the equation Aᵀb = 0,
(c) The row space of A is spanned by the rows (1, −3, 2, 1, 2) and (3, −9, 10, 2, 9) of A corresponding to the non-zero rows in the row-reduced form of A,
(d) The null space of A is spanned by the solutions (3, 1, 0, 0, 0), (−6, 0, 1, 4, 0), and (−2, 0, −3, 0, 4) of the equation Ax = 0.

The main differences between the natural "good" bases and the MP-bases that are responsible for the difference in the form of inverses, is that the latter have the additional restrictions of being orthogonal to each other (recall the orthogonality property of the Q-matrices), and the more severe of basis vectors mapping onto basis vectors according to Ax_i = σ_i b_i, i = 1, . . . , r, where the {x_i}_{i=1}^{n} and {b_j}_{j=1}^{m} are the eigenvectors of AᵀA and AAᵀ respectively and (σ_i)_{i=1}^{r} are the positive square roots of the non-zero eigenvalues of AᵀA (or of AAᵀ), with r denoting the dimension of the row or column space. This is considered as a serious restriction as the linear combination of the basis {b_j} that Ax_i should otherwise have been equal to, allows a greater flexibility in the matrix representation of the inverse that shows up in the structure of G. These are, in fact, quite general considerations in the matrix representation of linear operators; thus the basis that diagonalizes an n × n matrix (when this is possible) is not the standard "diagonal" orthonormal basis of Rⁿ, but a problem-dependent, less canonical, basis consisting of the n eigenvectors of the matrix. The 0-rows of the inverse of Eq. (28) result from the three-dimensional null-space variables x_2, x_4 and x_5, while the 0-columns come from the two-dimensional image-space dependency of b_3, b_4 on b_1 and b_2, that is from the last two zero rows of the reduced echelon form (27) of the augmented matrix.

We will return to this theme of the generation of a most appropriate problem-dependent topology for a given space in the more general context of chaos in Sec. 4.2.

In concluding this introduction to generalized inverses we note that the inverse G of f comes very close to being a right inverse: thus even though AG ≠ 1_2 its row-reduced form

\[ \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \]

is to be compared with the corresponding less satisfactory

\[ \begin{pmatrix} 1 & 0 & 2 & -1 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \]

representation of AG_MP.

3. Multifunctional Extension of Function Spaces

The previous section has considered the solution of ill-posed problems as multifunctions and has shown how this solution may be constructed. Here we introduce the multifunction space Multi|(X) as the first step toward obtaining a smallest dense extension Multi(X) of the function space Map(X). Multi|(X) is basic to our theory of chaos [Sengupta & Ray, 2000] in the sense that a chaotic state of a system can be fully described by such an indeterminate multifunctional state. In fact, multifunctions also enter in a natural way in describing the spectrum of nonlinear functions that we consider in Sec. 6; this is required to complete the construction of the smallest extension Multi(X) of the function space Map(X). The main tool in obtaining the
space Multi|(X) from Map(X) is a generalization of the technique of pointwise convergence of continuous functions to (discontinuous) functions. In the analysis below, we consider nets instead of sequences as the spaces concerned, like the topology of pointwise convergence, may not be first countable, Appendix A.1.

3.1. Graphical convergence of a net of functions

Let (X, U) and (Y, V) be Hausdorff spaces and (f_α)_{α∈D} : X → Y be a net of piecewise continuous functions, not necessarily with the same domain or range, and suppose that for each α ∈ D there is a finite set I_α = {1, 2, . . . P_α} such that f_α⁻ has P_α functional branches possibly with different domains; obviously I_α is a singleton iff f is injective. For each α ∈ D, define functions (g_αi)_{i∈I_α} : Y → X such that

\[ f_\alpha g_{\alpha i} f_\alpha = f^I_{\alpha i}, \qquad i = 1, 2, \ldots, P_\alpha, \]

where f^I_αi is a basic injective branch of f_α on some subset of its domain: g_αi f^I_αi = 1_X on D(f^I_αi), f^I_αi g_αi = 1_Y on D(g_αi) for each i ∈ I_α. The use of nets and filters is dictated by the fact that we do not assume X and Y to be first countable. In the application to the theory of dynamical systems that follows, X and Y are compact subsets of R when the use of sequences suffices.

In terms of the residual and cofinal subsets Res(D) and Cof(D) of a directed set D (Definition A.1.7), with x and y in the equations below being taken to belong to the required domains, define subsets D− of X and R− of Y as

\[ D_- = \{x \in X : ((f_\nu(x))_{\nu \in D} \text{ converges in } (Y, \mathcal{V}))\} \tag{30} \]
\[ R_- = \{y \in Y : (\exists i \in I_\nu)((g_{\nu i}(y))_{\nu \in D} \text{ converges in } (X, \mathcal{U}))\} \tag{31} \]

Thus,

D− is the set of points of X on which the values of a given net of functions (f_α)_{α∈D} converge pointwise in Y. Explicitly, this is the subset of X on which subnets¹⁹ in Map(X, Y) combine to form a net of functions that converge pointwise to a limit function F : D− → Y.

¹⁹ A subnet is the generalized uncountable equivalent of a subsequence; for the technical definition, see Appendix A.1.

R− is the set of points of Y on which the values of the nets in X generated by the injective branches of (f_α)_{α∈D} converge pointwise in Y. Explicitly, this is the subset of Y on which subnets of injective branches of (f_α)_{α∈D} in Map(Y, X) combine to form a net of functions that converge pointwise to a family of limit functions G : R− → X. Depending on the nature of (f_α)_{α∈D}, there may be more than one R− with a corresponding family of limit functions on each of them. To simplify the notation, we will usually let G : R− → X denote all the limit functions on all the sets R−.

If we consider cofinal rather than residual subsets of D then corresponding D+ and R+ can be expressed as

\[ D_+ = \{x \in X : ((f_\nu(x))_{\nu \in \mathrm{Cof}(D)} \text{ converges in } (Y, \mathcal{V}))\} \tag{32} \]
\[ R_+ = \{y \in Y : (\exists i \in I_\nu)((g_{\nu i}(y))_{\nu \in \mathrm{Cof}(D)} \text{ converges in } (X, \mathcal{U}))\}. \tag{33} \]

It is to be noted that the conditions D+ = D− and R+ = R− are necessary and sufficient for the Kuratowski convergence to exist. Since D+ and R+ differ from D− and R− only in having cofinal subsets of D replaced by residual ones, and since residual sets are also cofinal, it follows that D− ⊆ D+ and R− ⊆ R+. The sets D− and R− serve for the convergence of a net of functions just as D+ and R+ are for the convergence of subnets of the nets (adherence). The latter sets are needed when subsequences are to be considered as sequences in their own right as, for example, in dynamical systems theory in the case of ω-limit sets.

As an illustration of these definitions, consider the sequence of injective functions on the interval [0, 1], f_n(x) = 2ⁿx, for x ∈ [0, 1/2ⁿ], n = 0, 1, 2, . . . . Then D_0.2 is the set {0, 1, 2} and only D_0 is eventual in D. Hence D− is the single point set {0}. On the other hand D_y is eventual in D for all y and R− is [0, 1].

Definition 3.1 (Graphical Convergence of a net of functions). A net of functions (f_α)_{α∈D} : (X, U) → (Y, V) is said to converge graphically if either D− ≠ ∅ or R− ≠ ∅; in this case let F : D− → Y and G : R− → X be the entire collection of limit functions. Because of the assumed Hausdorffness of X and Y, these limits are well defined.

The graph of the graphical limit M of the net (f_α) : (X, U) → (Y, V), denoted by \( f_\alpha \xrightarrow{G} M \), is the
subset of $\mathcal{D}_{-} \times \mathcal{R}_{-}$ that is the union of the graphs of the function $F$ and the multifunction $G^{-}$

\[ G_{M} = G_{F} \cup G_{G^{-}} \]

where

\[ G_{G^{-}} = \{(x, y) \in X \times Y : (y, x) \in G_{G} \subseteq Y \times X\}. \]

Begin Tutorial 6: Graphical Convergence

The following two examples are basic to the understanding of the graphical convergence of functions to multifunctions and were the examples that motivated our search for an acceptable technique that did not require vertical portions of limit relations to disappear simply because they were non-functions: the disturbing question that needed an answer was how not to mathematically sacrifice these extremely significant physical components of the limiting correspondences. Furthermore, it appears to be quite plausible to expect a physical interaction between two spaces $X$ and $Y$ to be a consequence of both the direct interaction represented by $f : X \to Y$ and also the inverse interaction $f^{-} : Y \rightrightarrows X$, and our formulation of pointwise biconvergence is a formalization of this idea. Thus the basic examples (1) and (2) below produce multifunctions instead of discontinuous functions that would be obtained by the usual pointwise limit.

Example 3.1

(1)
\[ f_n(x) = \begin{cases} 0, & -1 \le x \le 0 \\ nx, & 0 \le x \le \dfrac{1}{n} \\ 1, & \dfrac{1}{n} \le x \le 1 \end{cases} \; : [-1, 1] \to [0, 1] \]

\[ g_n(y) = \frac{y}{n} : [0, 1] \to \left[0, \frac{1}{n}\right] \]

Then

\[ F(x) = \begin{cases} 0, & -1 \le x \le 0 \\ 1, & 0 < x \le 1 \end{cases} \quad \text{on } \mathcal{D}_{-} = \mathcal{D}_{+} = [-1, 0] \cup (0, 1] \]

\[ G(y) = 0 \quad \text{on } \mathcal{R}_{-} = [0, 1] = \mathcal{R}_{+}. \]

The graphical limit is $([-1, 0], 0) \cup (0, [0, 1]) \cup ((0, 1], 1)$.

(2) $f_n(x) = nx$ for $x \in [0, 1/n]$ gives $g_n(y) = y/n : [0, 1] \to [0, 1/n]$. Then

\[ F(x) = 0 \quad \text{on } \mathcal{D}_{-} = \{0\} = \mathcal{D}_{+}, \]

\[ G(y) = 0 \quad \text{on } \mathcal{R}_{-} = [0, 1] = \mathcal{R}_{+}. \]

The graphical limit is $(0, [0, 1])$.

In these examples that we consider to be the prototypes of graphical convergence of functions to multifunctions, $G(y) = 0$ on $\mathcal{R}_{-}$ because $g_n(y) \to 0$ for all $y \in \mathcal{R}_{-}$. Compare the graphical multifunctional limits with the corresponding usual pointwise functional limits characterized by discontinuity at $x = 0$. Two more examples from Sengupta and Ray [2000] that illustrate this new convergence principle tailored specifically to capture one-to-many relations are shown in Fig. 7, which also provides an example in Fig. 7(c) of a function whose iterates do not converge graphically because in this case both the sets $\mathcal{D}_{-}$ and $\mathcal{R}_{-}$ are empty. The power of graphical convergence in capturing multifunctional limits is further demonstrated by the example of the sequence $(\sin n\pi x)_{n=1}^{\infty}$ that converges to 0 both 1-integrally and test-functionally, Eqs. (3) and (4).

It is necessary to understand how the concepts of eventually in and frequently in of Appendix A.2 apply in examples (a) and (b) of Fig. 7. In these two examples we have two subsequences, one for the even indices and the other for the odd. For a point-to-point functional relation, this would mean that the sequence frequents the adherence set adh$(x)$ of the sequence $(x_n)$ but does not converge anywhere as it is not eventually in every neighborhood of any point. For a multifunctional limit however it is possible, as demonstrated by these examples, for the subsequences to be eventually in every neighborhood of certain subsets common to the eventual limiting sets of the subsequences; this intersection of the subsequential limits is now
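As a numerical aside on Example 3.1 (1), the following minimal sketch evaluates $f_n$ and the inverse $g_n$ of its injective branch for a large $n$, using exact rational arithmetic. The sample points `xs`, `ys` and the value of `n` are illustrative assumptions, not taken from the text. The images settle on the values of $F$, while all preimages $g_n(y)$ crowd toward $x = 0$, which is the vertical segment $(0, [0, 1])$ of the graphical limit.

```python
from fractions import Fraction

def f(n, x):
    """f_n of Example 3.1 (1) on [-1, 1]."""
    if x <= 0:
        return Fraction(0)
    if x <= Fraction(1, n):
        return n * x
    return Fraction(1)

def g(n, y):
    """Inverse of the injective branch of f_n: g_n(y) = y/n on [0, 1]."""
    return y / n

# Illustrative sample points (assumed for this sketch).
xs = [Fraction(-1), Fraction(-1, 2), Fraction(0), Fraction(1, 100), Fraction(1)]
ys = [Fraction(0), Fraction(1, 4), Fraction(1, 2), Fraction(1)]

n = 10**6
images = {x: f(n, x) for x in xs}        # approximates F on the sample points
preimages = {y: g(n, y) for y in ys}     # approximates G on the sample points

for x, v in images.items():
    print(f"f_n({x}) = {v}")
for y, v in preimages.items():
    print(f"g_n({y}) = {float(v):.1e}")
```

The pointwise limit alone would record only the jump of $F$ at $x = 0$; tracking $g_n$ as well is what retains the vertical portion of the limit.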
defined to be the limit of the original sequence. A similar situation obtains, for example, in the solution of simultaneous equations: The solution of the equation $a_{11} x_1 + a_{12} x_2 = b_1$ for one of the variables, $x_2$ say, with $a_{12} \neq 0$, is the set represented by the straight line $x_2 = m_1 x_1 + c_1$ for all $x_1$ in its domain, while for a different set of constants $a_{21}$, $a_{22}$ and $b_2$ the solution is the entirely different set $x_2 = m_2 x_1 + c_2$, under the assumption that $m_1 \neq m_2$ and $c_1 \neq c_2$. Thus even though the individual equations (subsequences) of the simultaneous set of equations (sequence) may have distinct solutions (limits), the solution of the equations is their common point of intersection.

Fig. 7. The graphical limits are: (a) $F(x) = \begin{cases} 1 & \text{for } 0 \le x \le 1 \\ 0 & \text{for } 1 < x \le 2 \end{cases}$ on $\mathcal{D}_{-} = [0, 1] \cup (1, 2]$, and $G(y) = 1$ on $\mathcal{R}_{-} = [0, 1]$. Also $G = \begin{cases} 1 & \text{on } \mathcal{R}_{+} = [0, 3/2] \\ 1 & \text{on } \mathcal{R}_{+} = [-1/2, 1] \end{cases}$. (b) $F(x) = 1$ on $\mathcal{D}_{-} = \{0\}$ and $G(y) = 0$ on $\mathcal{R}_{-} = \{1\}$. Also $F(x) = -1/2,\ 0,\ 1,\ 3/2$ respectively on $\mathcal{D}_{+} = (0, 3],\ \{2\},\ \{0\},\ (0, 2)$ and $G(y) = 0,\ 0,\ 2,\ 3$ respectively on $\mathcal{R}_{+} = (-1/2, 1],\ [1, 3/2),\ [0, 3/2),\ [-1/2, 0)$. (c) For $f(x) = -0.05 + x - x^2$ (12 iterates shown), no graphical limit as $\mathcal{D}_{-} = \emptyset = \mathcal{R}_{-}$. (d) For $f(x) = 0.7 + x - x^2$ (12 iterates shown), $F(x) = \alpha$ on $\mathcal{D}_{-} = [a, c]$, $G_1(y) = a$ and $G_2(y) = c$ on $\mathcal{R}_{-} = (-\infty, \alpha]$. Notice how the two fixed points and their equivalent images define the converged limit rectangular multi. As in example (1) one has $\mathcal{D}_{-} = \mathcal{D}_{+}$; also $\mathcal{R}_{-} = \mathcal{R}_{+}$.

Considered as sets in $X \times Y$, the discussion of convergence of a sequence of graphs $f_n : X \to Y$ would be incomplete without a mention of the convergence of a sequence of sets under the Hausdorff metric that is so basic in the study of fractals. In this case, one talks about the convergence of a sequence of compact subsets of the metric space $\mathbb{R}^n$ so that the sequences, as also the limit points that
are the fractals, are compact subsets of $\mathbb{R}^n$. Let $\mathcal{K}$ denote the collection of all nonempty compact subsets of $\mathbb{R}^n$. Then the Hausdorff metric $d_H$ between two sets in $\mathcal{K}$ is defined to be

\[ d_H(E, F) = \max\{\delta(E, F), \delta(F, E)\}, \qquad E, F \in \mathcal{K}, \]

where

\[ \delta(E, F) = \max_{x \in E} \min_{y \in F} \|x - y\|_2 \]

is the non-symmetric 2-norm distance in $\mathbb{R}^n$. The power and utility of the Hausdorff distance is best understood in terms of the dilations $E + \varepsilon := \bigcup_{x \in E} D_\varepsilon(x)$ of a subset $E$ of $\mathbb{R}^n$ by $\varepsilon$, where $D_\varepsilon(x)$ is a closed ball of radius $\varepsilon$ at $x$; physically a dilation of $E$ by $\varepsilon$ is a closed $\varepsilon$-neighborhood of $E$. Then a fundamental property of $d_H$ is that $d_H(E, F) \le \varepsilon$ iff both $E \subseteq F + \varepsilon$ and $F \subseteq E + \varepsilon$ hold simultaneously, which leads [Falconer, 1990] to the interesting consequence that

If $(F_n)_{n=1}^{\infty}$ and $F$ are nonempty compact sets, then $\lim_{n \to \infty} F_n = F$ in the Hausdorff metric iff $F_n \subseteq F + \varepsilon$ and $F \subseteq F_n + \varepsilon$ eventually. Furthermore if $(F_n)_{n=1}^{\infty}$ is a decreasing sequence of elements of a filter-base in $\mathbb{R}^n$, then the nonempty and compact limit set $F$ is given by

\[ \lim_{n \to \infty} F_n = F = \bigcap_{n=1}^{\infty} F_n. \]

Note that since $\mathbb{R}^n$ is Hausdorff, the assumed compactness of $F_n$ ensures that they are also closed in $\mathbb{R}^n$; $F$, therefore, is just the adherent set of the filter-base. In the deterministic algorithm for the generation of fractals by the so-called iterated function system (IFS) approach, $F_n$ is the inverse image by the $n$th iterate of a non-injective function $f$ having a finite number of injective branches and converging graphically to a multifunction. Under the conditions stated above, the Hausdorff metric ensures convergence of any class of compact subsets in $\mathbb{R}^n$. It appears eminently plausible that our multifunctional graphical convergence on Map$(\mathbb{R}^n)$ implies Hausdorff convergence on $\mathbb{R}^n$: in fact pointwise biconvergence involves simultaneous convergence of image and preimage nets on $Y$ and $X$, respectively. Thus confining ourselves to the simpler case of pointwise convergence, if $(f_\alpha)_{\alpha \in D}$ is a net of functions in Map$(X, Y)$, then the following theorem expresses the link between convergence in Map$(X, Y)$ and in $Y$.

Theorem 3.1. A net of functions $(f_\alpha)_{\alpha \in D}$ converges to a function $f$ in (Map$(X, Y), \mathcal{T})$ in the topology of pointwise convergence iff $(f_\alpha)$ converges pointwise to $f$ in the sense that $f_\alpha(x) \to f(x)$ in $Y$ for every $x$ in $X$.

Proof. Necessity. First consider $f_\alpha \to f$ in (Map$(X, Y), \mathcal{T})$. For an open neighborhood $V$ of $f(x)$ in $Y$ with $x \in X$, let $B(x; V)$ be a local neighborhood of $f$ in (Map$(X, Y), \mathcal{T})$, see Eq. (A.6) in Appendix A.1. By assumption of convergence, $(f_\alpha)$ must eventually be in $B(x; V)$, implying that $f_\alpha(x)$ is eventually in $V$. Hence $f_\alpha(x) \to f(x)$ in $Y$.

Sufficiency. Conversely, if $f_\alpha(x) \to f(x)$ in $Y$ for every $x \in X$, then for a finite collection of points $(x_i)_{i=1}^{I}$ of $X$ ($X$ may itself be uncountable) and corresponding open sets $(V_i)_{i=1}^{I}$ in $Y$ with $f(x_i) \in V_i$, let $B((x_i)_{i=1}^{I}; (V_i)_{i=1}^{I})$ be an open neighborhood of $f$. From the assumed pointwise convergence $f_\alpha(x_i) \to f(x_i)$ in $Y$ for $i = 1, 2, \dots, I$, it follows that $(f_\alpha(x_i))$ is eventually in $V_i$ for every $(x_i)_{i=1}^{I}$. Because $D$ is a directed set, the existence of a residual applicable globally for all $i = 1, 2, \dots, I$ is assured, leading to the conclusion that $f_\alpha(x_i) \in V_i$ eventually for every $i = 1, 2, \dots, I$. Hence $f_\alpha \in B((x_i)_{i=1}^{I}; (V_i)_{i=1}^{I})$ eventually; this completes the demonstration that $f_\alpha \to f$ in (Map$(X, Y), \mathcal{T})$, and thus of the proof.

End Tutorial 6

3.2. The extension Multi$_{|}(X, Y)$ of Map$(X, Y)$

In this section we show how the topological treatment of pointwise convergence of functions to functions given in Example A.1.1 of Appendix A.1 can be generalized to generate the boundary Multi$_{|}(X, Y)$ between Map$(X, Y)$ and Multi$(X, Y)$; here $X$ and $Y$ are Hausdorff spaces and Map$(X, Y)$ and Multi$(X, Y)$ are respectively the sets of all functional and non-functional relations between $X$ and $Y$. The generalization we seek defines neighborhoods of $f \in$ Map$(X, Y)$ to consist of those functional relations in Multi$(X, Y)$ whose images at any point $x \in X$ lie not only arbitrarily close to $f(x)$ (this generates the usual topology of pointwise convergence $\mathcal{T}_Y$ of Example A.1.1) but whose inverse images at $y = f(x) \in Y$ contain points arbitrarily close to $x$. Thus the graph of $g$ must not only lie close enough to $f(x)$ at $x$ in $V$, but must additionally be such that $g^{-}(y)$ has at least one branch in $U$ about $x$; thus $g$ is constrained to cling to $f$ as the number of points on the graph of $f$ increases
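The Hausdorff metric defined in Tutorial 6 can be illustrated on finite point sets, which are trivially compact. The sketch below is an assumption-laden toy (finite sets of tuples, the helper names `directed` and `hausdorff`, and the example sets are all hypothetical choices), not a general-purpose implementation for arbitrary compact sets.

```python
import math

def directed(E, F):
    """Non-symmetric distance delta(E, F) = max over x in E of min over y in F of ||x - y||_2."""
    return max(min(math.dist(x, y) for y in F) for x in E)

def hausdorff(E, F):
    """Hausdorff distance d_H(E, F) = max(delta(E, F), delta(F, E))."""
    return max(directed(E, F), directed(F, E))

# Illustrative finite sets in R^2 (assumed for this sketch).
E = [(0.0, 0.0), (1.0, 0.0)]
F = [(0.0, 0.0)]

print(directed(E, F))   # how far E sticks out of F
print(directed(F, E))   # how far F sticks out of E
print(hausdorff(E, F))

def F_n(n):
    """A sequence of two-point sets in R^1 shrinking onto the single point {0}."""
    return [(0.0,), (1.0 / n,)]

print([hausdorff(F_n(n), [(0.0,)]) for n in (1, 2, 4, 8)])
```

The asymmetry of `directed` is the reason the maximum of the two directed distances is taken: $F \subseteq E$ makes $\delta(F, E) = 0$ even though $E$ is far from being contained in a small dilation of $F$.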
with convergence and, unlike in the situation of simple pointwise convergence, no gaps in the graph of the limit object are permitted not only, as in Example A.1.1, on the domain of $f$, but simultaneously on its range too. We call the resulting generated topology the topology of pointwise biconvergence on Map$(X, Y)$, to be denoted by $\mathcal{T}$. Thus for any given integer $I \ge 1$, the generalization of Eq. (A.6) gives for $i = 1, 2, \dots, I$, the open sets of (Map$(X, Y), \mathcal{T})$ to be

\[ B((x_i), (V_i); (y_i), (U_i)) = \{g \in \mathrm{Map}(X, Y) : (g(x_i) \in V_i) \wedge (g^{-}(y_i) \cap U_i \neq \emptyset),\ i = 1, 2, \dots, I\}, \tag{34} \]

where $(x_i)_{i=1}^{I}$, $(V_i)_{i=1}^{I}$ are as in that example, $(y_i)_{i=1}^{I} \in Y$, and the corresponding open sets $(U_i)_{i=1}^{I}$ in $X$ are chosen arbitrarily.$^{20}$ A local base at $f$, for $(x_i, y_i) \in G_f$, is the set of functions of (34) with $y_i = f(x_i)$, and the collection of all local bases

\[ B_\alpha = B((x_i)_{i=1}^{I_\alpha}, (V_i)_{i=1}^{I_\alpha}; (y_i)_{i=1}^{I_\alpha}, (U_i)_{i=1}^{I_\alpha}), \tag{35} \]

for every choice of $\alpha \in D$, is a base ${}_{T}\mathcal{B}$ of (Map$(X, Y), \mathcal{T})$. Here the directed set $D$ is used as an indexing tool because, as pointed out in Example A.1.1, the topology of pointwise convergence is not first countable.

Fig. 8. The power of graphical convergence, illustrated for Example 3.1 (1), shows a local neighborhood of the functions $x$ and $2x$ in (a) and (b) at the four points $(x_i)_{i=1}^{4}$ with corresponding neighborhoods $(U_i)_{i=1}^{4}$ and $(V_i)_{i=1}^{4}$ at $(x_i, f(x_i))$ in $\mathbb{R}$ in the $X$ and $Y$ directions respectively, see Eqs. (34) and (A.6) for the notations. (a) shows a function $g$ in a pointwise neighborhood of $f$ determined by the open sets $V_i$, while (b) shows $g$ in a graphical neighborhood of $f$ due to both $U_i$ and $V_i$. A comparison of these figures demonstrates how the graphical neighborhood forces functions close to $f$ to remain closer to it than if they were in its pointwise neighborhood. This property is clearly visible in (a) where $g$, if it were to be in a graphical neighborhood of $f$, would be more faithful to it by having to be also in $U_2$ and $U_4$. Thus in this case not only must the images $f(x_i^j) \to f(x_i)$ as $V_i$ decreases, but also the preimages $x_i^j \to x_i$ with shrinking $U_i$. It is this simultaneous convergence of both images and preimages at every $x$ that makes graphical convergence a natural candidate for multifunctional convergence of functions.

In a manner similar to Eq. (34), the open sets of (Multi$(X, Y), \widehat{\mathcal{T}})$, where Multi$(X, Y)$ are multifunctions with only countably many values in $Y$ for every point of $X$ (so that we exclude continuous regions from our discussion except for the "vertical lines" of Multi$_{|}(X, Y)$), can be defined as

\[ \widehat{B}((x_i), (V_i); (y_i), (U_i)) = \{G \in \mathrm{Multi}(X, Y) : (G(x_i) \cap V_i \neq \emptyset) \wedge (G^{-}(y_i) \cap U_i \neq \emptyset)\}, \tag{36} \]

where

\[ G^{-}(y) = \{x \in X : y \in G(x)\} \]

and $(x_i)_{i=1}^{I} \in \mathcal{D}(M)$, $(V_i)_{i=1}^{I}$; $(y_i)_{i=1}^{I} \in \mathcal{R}(M)$, $(U_i)_{i=1}^{I}$ are chosen as in the above. The topology $\widehat{\mathcal{T}}$ of Multi$(X, Y)$ is generated by the collection of

$^{20}$ Equation (34) is essentially the intersection of the pointwise topologies (A.6) due to $f$ and $f^{-}$.
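The difference between the pointwise neighborhood of Eq. (A.6) and the graphical neighborhood of Eq. (34) can be made concrete with a small numerical check. The sketch below is only an approximation under stated assumptions: $f(x) = x$, the neighborhoods $V_i$ and $U_i$ are taken as symmetric open intervals of half-widths `eps_v` and `eps_u`, the candidate functions are continuous, and the condition $g^{-}(y_i) \cap U_i \neq \emptyset$ is tested by looking for a sign change of $g - y_i$ on a sample grid inside $U_i$. All function and parameter names are hypothetical.

```python
def in_pointwise_nbhd(g, f, xs, eps_v):
    """g(x_i) lies in V_i = (f(x_i) - eps_v, f(x_i) + eps_v) for every sample point."""
    return all(abs(g(x) - f(x)) < eps_v for x in xs)

def preimage_meets(g, y, x, eps_u, samples=401):
    """Approximate test of g^-(y) ∩ U != ∅ for U = (x - eps_u, x + eps_u).

    For continuous g, a sign change of g - y on interior sample points of U
    guarantees a preimage of y inside U (intermediate value theorem).
    """
    lo, hi = x - eps_u, x + eps_u
    vals = [g(lo + (hi - lo) * (k + 1) / (samples + 1)) - y for k in range(samples)]
    return min(vals) <= 0.0 <= max(vals)

def in_graphical_nbhd(g, f, xs, eps_v, eps_u):
    """Both conditions of Eq. (34), with y_i = f(x_i)."""
    return in_pointwise_nbhd(g, f, xs, eps_v) and all(
        preimage_meets(g, f(x), x, eps_u) for x in xs
    )

f = lambda x: x                  # reference function
g_far = lambda x: x + 0.04       # close in value, but preimages displaced by 0.04
g_near = lambda x: x + 0.005     # close in value and in preimage

xs = [0.2, 0.4, 0.6, 0.8]
print(in_pointwise_nbhd(g_far, f, xs, eps_v=0.05))
print(in_graphical_nbhd(g_far, f, xs, eps_v=0.05, eps_u=0.01))
print(in_graphical_nbhd(g_near, f, xs, eps_v=0.05, eps_u=0.01))
```

Here `g_far` passes the image test but fails the preimage test, which is the extra constraint that makes the graphical neighborhood smaller than the pointwise one.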
all local bases $\widehat{B}_\alpha$ for every choice of $\alpha \in D$, and it is not difficult to see from Eqs. (34) and (36) that the restriction $\widehat{\mathcal{T}}|_{\mathrm{Map}(X, Y)}$ of $\widehat{\mathcal{T}}$ to Map$(X, Y)$ is just $\mathcal{T}$.

Henceforth $\widehat{\mathcal{T}}$ and $\mathcal{T}$ will be denoted by the same symbol $\mathcal{T}$, and convergence in the topology of pointwise biconvergence in (Multi$(X, Y), \mathcal{T})$ will be denoted by $\rightrightarrows$, with the notation being derived from Theorem 3.1.

Definition 3.2 (Functionization of a multifunction). A net of functions $(f_\alpha)_{\alpha \in D}$ in Map$(X, Y)$ converges in (Multi$(X, Y), \mathcal{T})$, $f_\alpha \rightrightarrows M$, if it biconverges pointwise in (Map$(X, Y), \mathcal{T}^{*})$. Such a net of functions will be said to be a functionization of $M$.

Theorem 3.2. Let $(f_\alpha)_{\alpha \in D}$ be a net of functions in Map$(X, Y)$. Then

\[ f_\alpha \overset{G}{\to} M \iff f_\alpha \rightrightarrows M. \]

Proof. If $(f_\alpha)$ converges graphically to $M$ then either $\mathcal{D}_{-}$ or $\mathcal{R}_{-}$ is non-empty; let us assume both of them to be so. Then the sequence of functions $(f_\alpha)$ converges pointwise to a function $F$ on $\mathcal{D}_{-}$ and to functions $G$ on $\mathcal{R}_{-}$, and the local basic neighborhoods of $F$ and $G$ generate the topology of pointwise biconvergence.

Conversely, for pointwise biconvergence on $X$ and $Y$, $\mathcal{R}_{-}$ and $\mathcal{D}_{-}$ must be non-empty.

Observe that the boundary of Map$(X, Y)$ in the topology of pointwise biconvergence is a "line parallel to the $Y$-axis". We denote this closure of Map$(X, Y)$ as

Definition 3.3. Multi$_{|}((X, Y), \mathcal{T}) = \mathrm{Cl}(\mathrm{Map}((X, Y), \mathcal{T}))$.

The sense in which Multi$_{|}(X, Y)$ is the smallest closed topological extension of $M = \mathrm{Map}(X, Y)$ is the following; refer to Theorem A.1.4 and its proof. Let $(M, \mathcal{T}_0)$ be a topological space and suppose that

\[ \widehat{M} = M \cup \{\widehat{m}\} \]

is obtained by adjoining an extra point to $M$; here $M = \mathrm{Map}(X, Y)$ and $\widehat{m} \in \mathrm{Cl}(M)$ is the multifunctional limit in $\widehat{M} = \mathrm{Multi}_{|}(X, Y)$. Treat all open sets of $M$ generated by local bases of the type (35) with finite intersection property as a filter-base ${}_{F}\mathcal{B}$ on $X$ that induces a filter $\mathcal{F}$ on $M$ (by forming supersets of all elements of ${}_{F}\mathcal{B}$; see Appendix A.1) and thereby the filter-base

\[ {}_{\widehat{F}}\widehat{\mathcal{B}} = \{\widehat{B} = B \cup \{\widehat{m}\} : B \in {}_{F}\mathcal{B}\} \]

on $\widehat{M}$; this filter-base at $\widehat{m}$ can also be obtained independently from Eq. (36). Obviously ${}_{\widehat{F}}\widehat{\mathcal{B}}$ is an extension of ${}_{F}\mathcal{B}$ on $\widehat{M}$ and ${}_{F}\mathcal{B}$ is the filter induced on $M$ by ${}_{\widehat{F}}\widehat{\mathcal{B}}$. We may also consider the filter-base to be a topological base on $M$ that defines a coarser topology $\mathcal{T}$ on $M$ (through all unions of members of ${}_{F}\mathcal{B}$) and hence the topology

\[ \widehat{\mathcal{T}} = \{\widehat{G} = G \cup \{\widehat{m}\} : G \in \mathcal{T}\} \]

on $\widehat{M}$ to be the topology associated with $\widehat{\mathcal{F}}$. A finer topology on $\widehat{M}$ may be obtained by adding to $\widehat{\mathcal{T}}$ all the discarded elements of $\mathcal{T}_0$ that do not satisfy FIP. It is clear that $\widehat{m}$ is on the boundary of $M$ because every neighborhood of $\widehat{m}$ intersects $M$ by construction; thus $(M, \mathcal{T})$ is dense in $(\widehat{M}, \widehat{\mathcal{T}})$, which is the required topological extension of $(M, \mathcal{T})$.

In the present case, a filter-base at $f \in \mathrm{Map}(X, Y)$ is the neighborhood system ${}_{F}\mathcal{B}_f$ at $f$ given by decreasing sequences of neighborhoods $(V_k)$ and $(U_k)$ of $f(x)$ and $x$, respectively, and the filter $\widehat{\mathcal{F}}$ is the neighborhood filter $\mathcal{N}_f \cup G$ where $G \in \mathrm{Multi}_{|}(X, Y)$. We shall present an alternate, and perhaps more intuitively appealing, description of graphical convergence based on the adherence set of a filter in Sec. 4.1.

As more serious examples of the graphical convergence of a net of functions to a multifunction than those considered above, Fig. 9 shows the first four iterates of the tent map

\[ t(x) = \begin{cases} 2x, & 0 \le x < \dfrac{1}{2} \\ 2(1 - x), & \dfrac{1}{2} \le x \le 1 \end{cases} \qquad (t^1 = t) \]

defined on $[0, 1]$ and the sine map $f_n = |\sin(2^{n-1}\pi x)|$, $n = 1, \dots, 4$, with domain $[0, 1]$.

These examples illustrate the important generalization that periodic points may be replaced by the more general equivalence classes where a sequence of functions converges graphically; this generalization based on the ill-posed interpretation of dynamical systems is significant for non-iterative systems as in the second example above. The equivalence classes of the tent map for its two fixed points 0 and 2/3 generated by the first four iterates are

\[ [0]_4 = \left\{0, \frac{1}{8}, \frac{1}{4}, \frac{3}{8}, \frac{1}{2}, \frac{5}{8}, \frac{3}{4}, \frac{7}{8}, 1\right\} \]
\[ \left[\tfrac{2}{3}\right]_4 = \left\{c,\ \tfrac{1}{8} \pm c,\ \tfrac{1}{4} \pm c,\ \tfrac{3}{8} \pm c,\ \tfrac{1}{2} \pm c,\ \tfrac{5}{8} \pm c,\ \tfrac{3}{4} \pm c,\ \tfrac{7}{8} \pm c,\ 1 - c\right\} \]

where $c = 1/24$. If the moduli of the slopes of the graphs passing through these equivalent fixed points are greater than 1 then the graphs converge to multifunctions, and when these slopes are less than 1 the corresponding graphs converge to constant functions. It is to be noted that the number of equivalent fixed points in a class increases with the number of iterations $k$ as $2^{k-1} + 1$; this increase in the degree of ill-posedness is typical of discrete chaotic systems and can be regarded as a paradigm of chaos generated by the convergence of a family of functions.

Fig. 9. The first four iterates of (a) tent and (b) $|\sin(2^{n-1}\pi x)|$ maps show the formal similarity of the dynamics of these functions. It should be noted, as shown in Fig. 7, that although $(\sin(n\pi x))_{n=1}^{\infty}$ fails to converge at any point other than 0 and 1, the subsequence $(\sin(2^{n-1}\pi x))_{n=1}^{\infty}$ does converge graphically on a set dense in $[0, 1]$.

The $m$th iterate $t^m$ of the tent map has $2^m$ fixed points corresponding to the $2^m$ injective branches of $t^m$

\[ x_{mj} = \begin{cases} \dfrac{j - 1}{2^m - 1}, & j = 1, 3, \dots, (2^m - 1) \\[2mm] \dfrac{j}{2^m + 1}, & j = 2, 4, \dots, 2^m \end{cases} \qquad t^m(x_{mj}) = x_{mj},\ j = 1, 2, \dots, 2^m. \]

Let $X_m$ be the collection of these $2^m$ fixed points (thus $X_1 = \{0, 2/3\}$), and denote by $[X_m]$ the set of the equivalent points, one coming from each of the injective branches, for each of the fixed points: thus

\[ \mathcal{D}_{-} = [X_1] = \left\{[0], \left[\tfrac{2}{3}\right]\right\} \]

\[ [X_2] = \left\{[0], \left[\tfrac{2}{5}\right], \left[\tfrac{2}{3}\right], \left[\tfrac{4}{5}\right]\right\} \]

and $\mathcal{D}_{+} = \bigcup_{m=1}^{\infty} [X_m]$ is a non-empty countable set dense in $X$ at each of which the graphs of the sequence $(t^m)$ converge to a multifunction. New sets $[X_n]$ will be formed by subsequences of the higher iterates $t^n$ for $m = in$ with $i = 1, 2, \dots$ where these subsequences remain fixed. For example, the fixed points 2/5 and 4/5 produced respectively by the second and fourth injective branches of $t^2$ are also fixed for the seventh and thirteenth branches of $t^4$. For the shift map $2x \bmod 1$ on $[0, 1]$, $\mathcal{D}_{-} = \{[0], [1]\}$ where $[0] = \bigcup_{m=1}^{\infty} \{(i - 1)/2^m : i = 1, 2, \dots, 2^m\}$ and $[1] = \bigcup_{m=1}^{\infty} \{i/2^m : i = 1, 2, \dots, 2^m\}$.

It is useful to compare the graphical convergence of $(\sin(\pi n x))_{n=1}^{\infty}$ to $[0, 1]$ at 0 and to 0 at 1 with the usual integral and test-functional convergences to 0; note that the point 1/2, for example, belongs to $\mathcal{D}_{+}$ and not to $\mathcal{D}_{-} = \{0, 1\}$ because it is frequented by even $n$ only. However for the subsequence $(f_{2^{m-1}})_{m \in \mathbb{Z}_{+}}$, 1/2 is in $\mathcal{D}_{-}$ because if the graph of $f_{2^{m-1}}$ passes through $(1/2, 0)$ for some $m$, then so do the graphs for all higher values. Therefore $[0] = \bigcup_{m=1}^{\infty} \{i/2^{m-1} : i = 0, 1, \dots, 2^{m-1}\}$ is the equivalence class of $(f_{2^{m-1}})_{m=1}^{\infty}$ and this sequence converges to $[-1, 1]$ on this set. Thus our extension Multi$(X)$ is distinct from the distributional extension of function spaces with respect to test functions, and is able to correctly generate the pathological behavior of the limits that are so crucially vital in producing chaos.

4. Discrete Chaotic Systems are Maximally Ill-posed

The above ideas apply to the development of a criterion for chaos in discrete dynamical systems that is based on the limiting behavior of the graphs of a sequence of functions $(f_n)$ on $X$, rather than on the values that the sequence generates as is customary. For the development of the maximality of ill-posedness criterion of chaos, we need to refresh ourselves with the following preliminaries.

Resume Tutorial 5: Axiom of Choice and Zorn's Lemma

Let us recall from the first part of this Tutorial that for nonempty subsets $(A_\alpha)_{\alpha \in D}$ of a nonempty set $X$, the Axiom of Choice ensures the existence of a set $A$ such that $A \cap A_\alpha$ consists of a single element for every $\alpha$. The choice axiom has far-reaching consequences and a few equivalent statements, one of which, Zorn's lemma, will be used immediately in the following and is the topic of this resumed Tutorial. The beauty of the Axiom, and of its equivalents, is that they assert the existence of mathematical objects that, in general, cannot be demonstrated, and it is often believed that Zorn's lemma is one of the most powerful tools that a mathematician has available to him that is "almost indispensable in many parts of modern pure mathematics" with significant applications in nearly all branches of contemporary mathematics. This "lemma" talks about maximal (as distinct from "maximum") elements of a partially ordered set, a set in which some notion of $x_1$ "preceding" $x_2$ for two elements of the set has been defined.

A relation $\preceq$ on a set $X$ is said to be a partial order (or simply an order) if it is (compare with the properties (ER1)–(ER3) of an equivalence relation, Tutorial 1)

(OR1) Reflexive, that is $(\forall x \in X)(x \preceq x)$.
(OR2) Antisymmetric: $(\forall x, y \in X)(x \preceq y \wedge y \preceq x \Rightarrow x = y)$.
(OR3) Transitive, that is $(\forall x, y, z \in X)(x \preceq y \wedge y \preceq z \Rightarrow x \preceq z)$. Any notion of order on a set $X$ in the sense of one element of $X$ preceding another should possess at least this property.

The relation is a preorder if it is only reflexive and transitive, that is if only (OR1) and (OR3) are true. If the hypothesis of (OR2) is also satisfied by a preorder $\preceq$, then this preorder induces an equivalence relation $\sim$ on $X$ according to $(x \preceq y) \wedge (y \preceq x) \Leftrightarrow x \sim y$, and it is actually a partial order iff $x \sim y \Leftrightarrow x = y$. For any element $[x] \in X/\!\sim$ of the induced quotient space, let $\le$ denote the generated order in $X/\!\sim$ so that

\[ x \preceq y \iff [x] \le [y]; \]

then $\le$ is a partial order on $X/\!\sim$. If every two elements of $X$ are comparable, in the sense that either $x_1 \preceq x_2$ or $x_2 \preceq x_1$ for all $x_1, x_2 \in X$, then $X$ is said to be a totally ordered set or a chain. A totally ordered subset $(C, \preceq)$ of a partially ordered set $(X, \preceq)$ with the ordering induced from $X$ is known as a chain in $X$ if

\[ C = \{x \in X : (\forall c \in X)(c \preceq x \vee x \preceq c)\}. \tag{37} \]

The most important class of chains that we are concerned with in this work is that on the subsets $\mathcal{P}(X)$ of a set $(X, \subseteq)$ under the inclusion order; Eq. (37), as we shall see in what follows, defines a family of chains of nested subsets in $\mathcal{P}(X)$. Thus while the relation in $\mathbb{Z}$ defined by $n_1 \preceq n_2 \Leftrightarrow |n_1| \le |n_2|$ with $n_1, n_2 \in \mathbb{Z}$ preorders $\mathbb{Z}$, it is not a partial order because although $-n \preceq n$ and $n \preceq -n$ for any $n \in \mathbb{Z}$, it does not follow that $-n = n$. A common example of partial order on a set of sets, for example on the power set $\mathcal{P}(X)$ of a set $X$ (see footnote 23), is the inclusion relation $\subseteq$: the ordered set $\mathcal{X} = (\mathcal{P}(\{x, y, z\}), \subseteq)$ is partially ordered but not totally ordered because, for example, $\{x, y\} \not\subseteq \{y, z\}$, or $\{x\}$ is not comparable to $\{y\}$ unless $x = y$; however $\mathcal{C} = \{\emptyset, \{x\}, \{x, y\}\}$ does represent one of the many possible chains of $\mathcal{X}$. Another useful example of partial order is the following: Let $X$ and $(Y, \le)$ be sets with $\le$ ordering $Y$, and consider $f, g \in \mathrm{Map}(X, Y)$ with
$\mathcal{D}(f), \mathcal{D}(g) \subseteq X$. Then

\[ \begin{aligned} (\mathcal{D}(f) \subseteq \mathcal{D}(g))(f = g|_{\mathcal{D}(f)}) &\iff f \preceq g \\ (\mathcal{D}(f) = \mathcal{D}(g))(\mathcal{R}(f) \subseteq \mathcal{R}(g)) &\iff f \preceq g \\ (\forall x \in \mathcal{D}(f) = \mathcal{D}(g))(f(x) \le g(x)) &\iff f \preceq g \end{aligned} \tag{38} \]

define partial orders on Map$(X, Y)$. In the last case, the order is not total because any two functions whose graphs cross at some point in their common domain cannot be ordered by the given relation, while in the first any $f$ whose graph does not coincide with that of $g$ on the common domain is not comparable to it by this relation.

Let $(X, \preceq)$ be a partially ordered set and let $A$ be a subset of $X$. An element $a_{+} \in (A, \preceq)$ is said to be a maximal element of $A$ with respect to $\preceq$ if

\[ (\forall a \in (A, \preceq))(a_{+} \preceq a) \Rightarrow a = a_{+}, \tag{39} \]

that is, iff there is no $a \in A$ with $a \neq a_{+}$ and $a \succeq a_{+}$.$^{21}$ Expressed otherwise, this implies that an element $a_{+}$ of a subset $A \subseteq (X, \preceq)$ is maximal in $(A, \preceq)$ iff it is true that

\[ (a \preceq a_{+} \in A)\ (\text{for every } a \in (A, \preceq) \text{ comparable to } a_{+}); \tag{40} \]

thus $a_{+}$ in $A$ is a maximal element of $A$ iff it is strictly greater than every other comparable element of $A$. This of course does not mean that each element $a$ of $A$ satisfies $a \preceq a_{+}$ because every pair of elements of a partially ordered set need not be comparable: in a totally ordered set there can be at most one maximal element. In comparison, an element $a_{\infty}$ of a subset $A \subseteq (X, \preceq)$ is the unique maximum (largest, greatest, last) element of $A$ iff

\[ (a \preceq a_{\infty} \in A)\ (\text{for every } a \in (A, \preceq)), \tag{41} \]

implying that $a_{\infty}$ is the element of $A$ that is strictly larger than every other element of $A$. As in the case of the maximal, although this also does not require all elements of $A$ to be comparable to each other, it does require $a_{\infty}$ to be larger than every element of $A$. The dual concepts of minimal and minimum can be similarly defined by essentially reversing the roles of $a$ and $b$ in relational expressions like $a \preceq b$.

The last concept needed to formalize Zorn's lemma is that of an upper bound: For a subset $(A, \preceq)$ of a partially ordered set $(X, \preceq)$, an element $u$ of $X$ is an upper bound of $A$ in $X$ iff

\[ (a \preceq u \in (X, \preceq))\ (\text{for every } a \in (A, \preceq)) \tag{42} \]

which requires the upper bound $u$ to be larger than all members of $A$, with the corresponding lower bounds of $A$ being defined in a similar manner. Of course, it is again not necessary that the elements of $A$ be comparable to each other, and it should be clear from Eqs. (41) and (42) that when an upper bound of a set is in the set itself, then it is the maximum element of the set. If the set of upper (lower) bounds of a subset $(A, \preceq)$ of a set $(X, \preceq)$ has a least (greatest) element, then this smallest upper bound (largest lower bound) is called the least upper bound (greatest lower bound) or supremum (infimum) of $A$ in $X$. Combining Eqs. (41) and (42) then yields

\[ \begin{aligned} \sup_X A &= \{a_{\leftarrow} \in \Omega_A : a_{\leftarrow} \preceq u\ \ \forall u \in (\Omega_A, \preceq)\} \\ \inf_X A &= \{{}_{\rightarrow}a \in \Lambda_A : l \preceq {}_{\rightarrow}a\ \ \forall l \in (\Lambda_A, \preceq)\} \end{aligned} \tag{43} \]

where $\Omega_A = \{u \in X : (\forall a \in A)(a \preceq u)\}$ and $\Lambda_A = \{l \in X : (\forall a \in A)(l \preceq a)\}$ are the sets of all upper and lower bounds of $A$ in $X$. Equation (43) may be expressed in the equivalent but more transparent form as

\[ \begin{aligned} a_{\leftarrow} = \sup_X A &\iff (a \in A \Rightarrow a \preceq a_{\leftarrow}) \wedge (a_0 \prec a_{\leftarrow} \Rightarrow a_0 \prec b \preceq a_{\leftarrow} \text{ for some } b \in A) \\ {}_{\rightarrow}a = \inf_X A &\iff (a \in A \Rightarrow {}_{\rightarrow}a \preceq a) \wedge ({}_{\rightarrow}a \prec a_1 \Rightarrow {}_{\rightarrow}a \preceq b \prec a_1 \text{ for some } b \in A) \end{aligned} \tag{44} \]

to imply that $a_{\leftarrow}$ (${}_{\rightarrow}a$) is the upper (lower) bound of $A$ in $X$ which precedes (succeeds) every other upper (lower) bound of $A$ in $X$. Notice that uniqueness in the definitions above is a direct consequence of the uniqueness of greatest and least elements of a set. It must be noted that whereas maximal and maximum are properties of the particular subset and have nothing to do with anything outside it, upper and lower bounds of a set are defined only with respect to a superset that may contain it.

The following example, besides being useful in Zorn's lemma, is also of great significance in fixing some of the basic ideas needed in our future arguments involving classes of sets ordered by the inclusion relation.

Example 4.1. Let $\mathcal{X} = \mathcal{P}(\{a, b, c\})$ be ordered by the inclusion relation $\subseteq$. The subset $\mathcal{A} = \mathcal{P}(\{a, b, c\}) - \{a, b, c\}$ has three maximals $\{a, b\}$, $\{b, c\}$ and $\{c, a\}$ but no maximum as there is no

$^{21}$ If $\preceq$ is an order relation in $X$ then the strict relation $\prec$ in $X$ corresponding to $\preceq$, given by $x \prec y \Leftrightarrow (x \preceq y) \wedge (x \neq y)$, is not an order relation because unlike $\preceq$, $\prec$ is not reflexive even though it is both transitive and asymmetric.
$A_{\infty} \in \mathcal{A}$ satisfying $A \preceq A_{\infty}$ for every $A \in \mathcal{A}$, while $\mathcal{P}(\{a, b, c\}) - \emptyset$ has the three minimals $\{a\}$, $\{b\}$ and $\{c\}$ but no minimum. This shows that a subset of a partially ordered set may have many maximals (minimals) without possessing a maximum (minimum), but a subset has a maximum (minimum) iff this is its unique maximal (minimal). If $\mathcal{A} = \{\{a, b\}, \{a, c\}\}$, then every subset of the intersection of the elements of $\mathcal{A}$, namely $\{a\}$ and $\emptyset$, is a lower bound of $\mathcal{A}$, and all supersets in $\mathcal{X}$ of the union of its elements (which in this case is just $\{a, b, c\}$) are its upper bounds. Notice that while the maximal (minimal) and maximum (minimum) are elements of $\mathcal{A}$, upper and lower bounds need not be contained in their sets. In this class $(\mathcal{X}, \subseteq)$ of subsets of a set $X$, $X_{+}$ is a maximal element of $\mathcal{X}$ iff $X_{+}$ is not contained in any other subset of $X$, while $X_{\infty}$ is a maximum of $\mathcal{X}$ iff $X_{\infty}$ contains every other subset of $X$.

Let $\mathcal{A} := \{A_\alpha \in \mathcal{X}\}_{\alpha \in D}$ be a non-empty subclass of $(\mathcal{X}, \subseteq)$, and suppose that both $\bigcup A_\alpha$ and $\bigcap A_\alpha$ are elements of $\mathcal{X}$. Since each $A_\alpha$ is $\subseteq$-less than $\bigcup A_\alpha$, it follows that $\bigcup A_\alpha$ is an upper bound of $\mathcal{A}$; this is also the smallest of all such bounds because if $U$ is any other upper bound then every $A_\alpha$ must precede $U$ by Eq. (42) and therefore so must $\bigcup A_\alpha$ (because the union of a class of subsets of a set is the smallest set that contains each member of the class: $A_\alpha \subseteq U \Rightarrow \bigcup A_\alpha \subseteq U$ for subsets $(A_\alpha)$ and $U$ of $X$). Analogously, since $\bigcap A_\alpha$ is $\subseteq$-less than each $A_\alpha$ it is a lower bound of $\mathcal{A}$; that it is the greatest of all the lower bounds $L$ in $\mathcal{X}$ follows because the intersection of a class of subsets is the largest set that is contained in each of the subsets: $L \subseteq A_\alpha \Rightarrow L \subseteq \bigcap A_\alpha$ for subsets $L$ and $(A_\alpha)$ of $X$. Hence the supremum and infimum of $\mathcal{A}$ in $(\mathcal{X}, \subseteq)$ given by

\[ A_{\leftarrow} = \sup_{(\mathcal{X}, \subseteq)} \mathcal{A} = \bigcup_{A \in \mathcal{A}} A \quad \text{and} \quad {}_{\rightarrow}A = \inf_{(\mathcal{X}, \subseteq)} \mathcal{A} = \bigcap_{A \in \mathcal{A}} A \tag{45} \]

are both elements of $(\mathcal{X}, \subseteq)$. Intuitively, an upper (respectively, lower) bound of $\mathcal{A}$ in $\mathcal{X}$ is any subset of $X$ that contains (respectively, is contained in) every member of $\mathcal{A}$.

The statement of Zorn's lemma and its proof can now be completed in three stages as follows. For Theorem 4.1 below that constitutes the most significant technical first stage, let $g$ be a function on $(X, \preceq)$ that assigns to every $x \in X$ an immediate successor $y \in X$ such that

\[ M(x) = \{y \succ x : \nexists\, x_{*} \in X \text{ satisfying } x \prec x_{*} \prec y\} \]

are all the successors of $x$ in $X$ with no element of $X$ lying strictly between $x$ and $y$. Select a representative of $M(x)$ by a choice function $f_C$ such that

\[ g(x) = f_C(M(x)) \in M(x) \]

is an immediate successor of $x$ chosen from the many possible in the set $M(x)$. The basic idea in the proof of the first of the three parts is to express the existence of a maximal element of a partially ordered set $X$ in terms of the existence of a fixed point in the set, which follows as a contradiction of the assumed hypothesis that every point in $X$ has an immediate successor. Our basic application of immediate successors in the following will be to classes $\mathcal{X} \subseteq (\mathcal{P}(X), \subseteq)$ of subsets of a set $X$ ordered by inclusion. In this case for any $A \in \mathcal{X}$, the function $g$ can be taken to be the superset

\[ g(A) = A \cup f_C(G(A) - A), \quad \text{where } G(A) = \{x \in X - A : A \cup \{x\} \in \mathcal{X}\} \tag{46} \]

of $A$. Repeated application of $g$ to $A$ then generates a principal filter, and hence an associated sequence, based at $A$.

Theorem 4.1. Let $(X, \preceq)$ be a partially ordered set that satisfies

(ST1) There is a smallest element $x_0$ of $X$ which has no immediate predecessor in $X$.
(ST2) If $C \subseteq X$ is a totally ordered subset in $X$, then $c_{*} = \sup_X C$ is in $X$.

Then there exists a maximal element $x_{+}$ of $X$ which has no immediate successor in $X$.

Proof. Let $T \subseteq (X, \preceq)$ be a subset of $X$. If the conclusion of the theorem is false then the alternative

(ST3) Every element $x \in T$ has an immediate successor $g(x)$ in $T$ $^{22}$

$^{22}$ This makes $T$, and hence $X$, inductively defined infinite sets. It should be realized that (ST3) does not mean that every member of $T$ is obtained from $g$, but only ensures that the immediate successor of any element of $T$ is also in $T$. The infimum ${}_{\rightarrow}T$ of these towers satisfies the additional property of being totally ordered (and is therefore essentially a sequence or net) in $(X, \preceq)$ to which (ST2) can be applied.
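The successor function of Eq. (46) can be tried out on a small finite class of sets, where the chain generated from the smallest element terminates at a fixed point of $g$ that is a maximal element of the class. Everything concrete in the sketch below is an assumption made for illustration: the ground set $\{1, \dots, 6\}$, the class of subsets containing no two consecutive integers, and the choice function that picks the smallest admissible element. In the finite case the fixed point is reached directly, whereas the theorem itself argues by contradiction for general partially ordered sets.

```python
GROUND = frozenset(range(1, 7))

def in_class(A):
    """Hypothetical class: subsets of GROUND with no two consecutive integers."""
    elems = sorted(A)
    return all(b - a != 1 for a, b in zip(elems, elems[1:]))

def G(A):
    """G(A) = {x in X - A : A ∪ {x} is in the class}, cf. Eq. (46)."""
    return {x for x in GROUND - A if in_class(A | {x})}

def g(A):
    """Immediate successor A ∪ f_C(G(A)); the choice function f_C here is min.

    If no admissible element exists, A is returned unchanged (a fixed point of g).
    """
    candidates = G(A)
    if not candidates:
        return A
    return A | {min(candidates)}

def tower(A):
    """Chain A, g(A), g(g(A)), ... up to the first fixed point of g."""
    chain = [A]
    while g(chain[-1]) != chain[-1]:
        chain.append(g(chain[-1]))
    return chain

chain = tower(frozenset())
print([sorted(A) for A in chain])
```

Each step adds exactly one element, so no member of the class lies strictly between $A$ and $g(A)$, and the final set admits no further extension within the class. A different choice function would in general lead to a different maximal element.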
leads, as shown below, to a contradiction that can be resolved only by the conclusion of the theorem. A subset T of (X, ⪯) satisfying conditions (ST1)–(ST3) is sometimes known as a g-tower or a g-sequence: an obvious example of a tower is (X, ⪯) itself. If

    →T = ⋂ {T ∈ 𝒯 : T is an x0-tower}

is the (P(X), ⊆)-infimum of the class 𝒯 of all sequential towers of (X, ⪯), we show that this smallest sequential tower is in fact a sequential totally ordered chain in (X, ⪯) built from x0 by the g-function. Let the subset

    C_T = {c ∈ X : (∀t ∈ →T)(t ⪯ c ∨ c ⪯ t)} ⊆ X        (47)

of X be a g-chain in →T in the sense that [cf. Eq. (37)] it is that subset of X each of whose elements is comparable with every element of →T. The conditions (ST1)–(ST3) for C_T can be verified as follows to demonstrate that C_T is a g-tower.

(1) x0 ∈ C_T, because it precedes each x ∈ →T.

(2) Let c← = sup_X C_T be the supremum of the chain C_T in X so that by (ST2), c← ∈ X. Let t ∈ →T. If there is some c ∈ C_T such that t ⪯ c, then surely t ⪯ c←. Else, c ⪯ t for every c ∈ C_T shows that c← ⪯ t because c← is the smallest of all the upper bounds t of C_T. Therefore c← ∈ C_T.

(3) In order to show that g(c) ∈ C_T whenever c ∈ C_T it needs to be verified that for all t ∈ →T, either t ⪯ c ⇒ t ⪯ g(c) or c ⪯ t ⇒ g(c) ⪯ t. As the former is obvious, we investigate the latter as follows; note that g(t) ∈ →T by (ST3). The first step is to show that the subset

    C_g = {t ∈ →T : (∀c ∈ C_T)(t ⪯ c ∨ g(c) ⪯ t)}        (48)

of →T, which is a chain in X (observe the inverse roles of t and c here as compared to that in Eq. (47)), is a tower: Let t← be the supremum of C_g and take c ∈ C_T. If there is some t ∈ C_g for which g(c) ⪯ t, then clearly g(c) ⪯ t←. Else, t ⪯ c for each t ∈ C_g shows that t← ⪯ c because t← is the smallest of all the upper bounds c of C_g. Hence t← ∈ C_g.

Property (ST3) for C_g follows from a small yet significant modification of the above arguments in which the immediate successor g(t) of t ∈ C_g formally replaces the supremum t← of C_g. Thus given a c ∈ C_T, if there is some t ∈ C_g for which g(c) ⪯ t then g(c) ⪯ g(t); this combined with (c = t) ⇒ (g(c) = g(t)) yields g(c) ⪯ g(t). On the other hand, t ≺ c for every t ∈ C_g requires g(t) ⪯ c as otherwise (t ≺ c) ⇒ (c ≺ g(t)) would, from the resulting consequence t ≺ c ≺ g(t), contradict the assumed hypothesis that g(t) is the immediate successor of t. Hence, C_g is a g-tower in X.

To complete the proof that g(c) ∈ C_T, and thereby the argument that C_T is a tower, we first note that as →T is the smallest tower and C_g is built from it, C_g ⊆ →T must in fact be →T itself. From Eq. (48) therefore, for every t ∈ →T either t ⪯ g(c) or g(c) ⪯ t, so that g(c) ∈ C_T whenever c ∈ C_T. This concludes the proof that C_T is actually the tower →T in X.

From (ST2), the implication of the chain C_T

    C_T = →T = C_g        (49)

being the minimal tower →T is that the supremum t← of the totally ordered →T in its own tower (as distinct from in the tower X: recall that →T is a subset of X) must be contained in itself, that is

    sup_{C_T}(C_T) = t← ∈ →T ⊆ X.        (50)

This however leads to the contradiction from (ST3) that g(t←) be an element of →T, unless of course

    g(t←) = t←,        (51)

which because of (49) may also be expressed equivalently as g(c←) = c← ∈ C_T. As the sequential totally ordered set →T is a subset of X, Eq. (48) implies that t← is a maximal element of X which allows (ST3) to be replaced by the remarkable inverse criterion that

(ST3′) If x ∈ X and w precedes x, w ≺ x, then w ∈ X,

that is obviously false for a general tower T. In fact, it follows directly from Eq. (39) that under (ST3′) any x+ ∈ X is a maximal element of X iff it is a fixed point of g as given by Eq. (51). This proves the theorem and also demonstrates how, starting from a minimum element of a partially ordered set X, (ST3) can be used to generate inductively a totally ordered sequential subset of X leading to a maximal x+ = c← ∈ (X, ⪯) that is a fixed point of the generating function g whenever the supremum t← of the chain →T is in X.

Remark. The proof of this theorem, despite its apparent length and technically involved character,
carries the highly significant underlying message that

    Any inductive sequential g-construction of an infinite chained tower C_T starting with a smallest element x0 ∈ (X, ⪯) such that a supremum c← of the g-generated sequential chain C_T in its own tower is contained in itself, must necessarily terminate with a fixed point relation of the type (51) with respect to the supremum. Note from Eqs. (50) and (51) that the role of (ST2) applied to a fully ordered tower is the identification of the maximal of the tower (which depends only on the tower and has nothing to do with anything outside it) with its supremum that depends both on the tower and its complement.

Thus although purely set-theoretic in nature, the filter-base associated with a sequentially totally ordered set may be interpreted to lead to the usual notions of adherence and convergence of filters and thereby of a generated topology for (X, ⪯), see Appendix A.1 and Example A.1.3. This very significant apparent inter-relation between topologies, filters and orderings will form the basis of our approach to the condition of maximal ill-posedness for chaos.

In the second stage of the three-stage programme leading to Zorn's lemma, the tower Theorem 4.1 and the comments of the preceding paragraph are applied at a higher level to a very special class of the power set of a set, the class of all the chains of a partially ordered set, to directly lead to the physically significant

Theorem 4.2 (Hausdorff Maximal Principle). Every partially ordered set (X, ⪯) has a maximal totally ordered subset.23

Proof. Here the base level is

    𝒳 = {C ∈ P(X) : C is a chain in (X, ⪯)} ⊆ P(X)        (52)

be the set of all the totally ordered subsets of (X, ⪯). Since 𝒳 is a collection of (sub)sets of X, we order it by the inclusion relation on 𝒳 and use the tower Theorem to demonstrate that (𝒳, ⊆) has a maximal element C←, which by the definition of 𝒳, is the required maximal chain in (X, ⪯).

Let 𝒞 be a chain in 𝒳 of the chains in (X, ⪯). In order to apply the tower Theorem to (𝒳, ⊆) we need to verify hypothesis (ST2) that the smallest

    C∗ = sup_𝒳 𝒞 = ⋃_{C ∈ 𝒞} C        (53)

of the possible upper bounds of 𝒞 [see Eq. (45)] is a chain of (X, ⪯). Indeed, if x1, x2 ∈ X are two points of C∗ with x1 ∈ C1 and x2 ∈ C2, then from the ⊆-comparability of C1 and C2 we may choose x1, x2 ∈ C1 ⊇ C2, say. Thus x1 and x2 are ⪯-comparable as C1 is a chain in (X, ⪯); C∗ ∈ 𝒳 is therefore a chain in (X, ⪯) which establishes that the supremum of a chain of (𝒳, ⊆) is a chain in (X, ⪯).

The tower Theorem 4.1 can now be applied to (𝒳, ⊆) with C0 as its smallest element to construct a g-sequentially towered fully ordered subset of 𝒳 consisting of chains in X

    𝒞_T = {Ci ∈ P(X) : Ci ⊆ Cj for i ≤ j ∈ N} = →T ⊆ P(X)

of (𝒳, ⊆), consisting of the common elements of all g-sequential towers T ∈ 𝒯 of (𝒳, ⊆), that in fact is a principal filter base of chained subsets of (X, ⪯) at C0. The supremum (chain in X) C← of 𝒞_T in 𝒞_T must now satisfy, by Theorem 4.1, the fixed point g-chain of X

    sup_{𝒞_T}(𝒞_T) = C← = g(C←) ∈ 𝒞_T ⊆ P(X),

where the chain g(C) = C ∪ f_C(G(C) − C) with G(C) = {x ∈ X − C : C ∪ {x} ∈ 𝒳}, is an immediate successor of C obtained by choosing one point x = f_C(G(C) − C) from the many possible in G(C) − C such that the resulting g(C) = C ∪ {x} is a strict successor of the chain C with no others lying between it and C. Note that C← ∈ (𝒳, ⊆) is

23 Recall that this means that if there is a totally ordered chain C in (X, ⪯) that succeeds C+, then C must be C+ so that no chain in X can be strictly larger than C+. The notation adopted here and below is the following: If X = {x, y} is a non-empty set, then 𝒳 := P(X) = {A : A ⊆ X} = {∅, {x}, {y}, {x, y}} is the set of subsets of X, and 𝔛 := P²(X) = {𝒜 : 𝒜 ⊆ 𝒳}, the set of all subsets of 𝒳, consists of the 16 elements ∅, {∅}, {{x}}, {{y}}, {{x, y}}, {∅, {x}}, {∅, {y}}, {∅, {x, y}}, {{x}, {y}}, {{x}, {x, y}}, {{y}, {x, y}}, {∅, {x}, {y}}, {∅, {x}, {x, y}}, {∅, {y}, {x, y}}, {{x}, {y}, {x, y}}, and 𝒳: an element of P²(X) is a subset of P(X), any element of which is a subset of X. Thus if C = {0, 1, 2} is a chain in (X = {0, 1, 2}, ≤), then 𝒞 = {{0}, {0, 1}, {0, 1, 2}} ⊆ P(X) and 𝔆 = {{{0}}, {{0}, {0, 1}}, {{0}, {0, 1}, {0, 1, 2}}} ⊆ P²(X) represent chains in (P(X), ⊆) and (P²(X), ⊆), respectively.
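The counting in this footnote is easy to check mechanically. The sketch below is an illustration only, with helper names (`power_set`, `is_chain`) that are not part of the text: it builds P(X) and P²(X) for a two-point set and verifies that the two families quoted for X = {0, 1, 2} are chains under inclusion.

```python
from itertools import chain, combinations


def power_set(items):
    """All subsets of `items`, each returned as a frozenset."""
    items = list(items)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [frozenset(s) for s in subsets]


def is_chain(family):
    """True when every pair of members is comparable under inclusion."""
    return all(a <= b or b <= a for a, b in combinations(list(family), 2))


level1 = power_set({"x", "y"})   # P(X)
level2 = power_set(level1)       # P^2(X): subsets of P(X)
print(len(level1), len(level2))  # 4 16

# The two chains quoted in the footnote for X = {0, 1, 2}.
c1 = [frozenset({0}), frozenset({0, 1}), frozenset({0, 1, 2})]
c2 = [frozenset(c1[:k]) for k in (1, 2, 3)]
print(is_chain(c1), is_chain(c2))  # True True
```

Using `frozenset` keeps the members hashable, so a set of sets (and a set of sets of sets) can be formed directly.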
[Fig. 10 diagram: (X, ⪯) → 𝒳 = {C ⊆ X : C is a chain in (X, ⪯)} → Tower Theorem 4.1 → 𝒞_T = {T ⊆ (𝒳, ⊆) : T is a C0-tower} → Hausdorff Maximal Chain Theorem: sup_{𝒞_T}(𝒞_T) = C← = g(C←) ∈ 𝒞_T ⊆ (𝒳, ⊆) → Zorn Lemma: (u ∈ X ⪰ c)(∀c ∈ (C←, ⪯)).]

Fig. 10. Application of Zorn's Lemma to (X, ⪯). Starting with a partially ordered set (X, ⪯), construct: (a) The one-level higher subset 𝒳 = {C ∈ P(X) : C is a chain in (X, ⪯)} of P(X) consisting of all the totally ordered subsets of (X, ⪯), (b) The smallest common g-sequential totally ordered towered chain 𝒞_T = {Ci ∈ P(X) : Ci ⊆ Cj for i ≤ j} ⊆ P(X) of all sequential g-towers of 𝒳 by Theorem 4.1, which in fact is a principal filter base of totally ordered subsets of (X, ⪯) at the smallest element C0. (c) Apply Hausdorff Maximal Principle to (𝒳, ⊆) to get the subset sup_{𝒞_T}(𝒞_T) = C← = g(C←) ∈ 𝒞_T ⊆ P(X) of (X, ⪯) as the supremum of (𝒳, ⊆) in 𝒞_T. The identification of this supremum as a maximal element of (𝒳, ⊆) is a consequence of (ST2) and Eqs. (50), (51) that actually puts the supremum into 𝒳 itself. By returning to the original level (X, ⪯) (d) Zorn's Lemma finally yields the required maximal element u ∈ X as an upper bound of the maximal totally ordered subset (C←, ⪯) of (X, ⪯). The dashed segment denotes the higher Hausdorff (𝒳, ⊆) level leading to the base (X, ⪯) Zorn level.

only one of the many maximal fully ordered subsets possible in (X, ⪯).

With the assurance of the existence of a maximal chain C← among all fully ordered subsets of a partially ordered set (X, ⪯), the arguments are completed by returning to the basic level of X.

Theorem 4.3 (Zorn's Lemma). Let (X, ⪯) be a partially ordered set such that every totally ordered subset of X has an upper bound in X. Then X has at least one maximal element with respect to its order.

Proof. The proof of this final part is a mere application of the Hausdorff Maximal Principle on the existence of a maximal chain C← in X to the hypothesis of this theorem that C← has an upper bound u in X that quickly leads to the identification of this bound as a maximal element x+ of X. Indeed, if there is an element v ∈ X that is comparable to u and v ⋠ u, then v cannot be in C← as it is necessary for every x ∈ C← to satisfy x ⪯ u. Clearly then C← ∪ {v} is a chain in (X, ⪯) bigger than C← which contradicts the assumed maximality of C← among the chains of X.

The sequence of steps leading to Zorn's Lemma, and thence to the maximal of a partially ordered set, is summarized in Fig. 10.

The three examples below of the application of Zorn's Lemma clearly reflect the increasing complexity of the problem considered, with the maximals a point, a subset, and a set of subsets of X, so that these are elements of X, P(X) and P²(X), respectively.

[Fig. 11: tree diagrams of two partially ordered sets, panels (a) and (b).]

Fig. 11. Tree diagrams of two partially ordered sets where two points are connected by a line iff they are comparable to each other, with the solid lines linking immediate neighbors and the dashed, dotted and dashed–dotted lines denoting second, third and fourth generation orderings according to the principle of transitivity of the order relation. There are 8 × 2 chains of (a) and 7 chains of (b) starting from respective smallest elements with the immediate successor chains shown in solid lines. The 17 point set X = {0, 1, 2, . . . , 15, 16} in (a) has two maximals but no maximum, while in (b) there is a single maximum of P({a, b, c}), and three maximals without any maximum for P({a, b, c}) − {a, b, c}. In (a), let A = {1, 3, 4, 7, 9, 10, 15}, B = {1, 3, 4, 6, 7, 13, 15}, C = {1, 3, 4, 10, 11, 16} and D = {1, 3, 4}. The upper bounds of D in A are 7, 10 and 15 without any supremum (as there is no smallest element of {7, 10, 15}), and the upper bounds of D in B are 7 and 15 with sup_B(D) = 7, while sup_C(D) = 10. Finally the maximal, maximum and the supremum in A of {1, 3, 4, 7} are all the same, illustrating how the supremum of a set can belong to itself. Observe how the supremum and upper bound of a set are with reference to its complement in contrast with the maximum and maximal that have nothing to do with anything outside the set.

Example 4.2

(1) Let X = ({a, b, c}, ⪯) be a three-point base-level ground set ordered lexicographically, that is a ≺ b ≺ c. A chain 𝒞 of the partially ordered Hausdorff-level set 𝒳 consisting of subsets of X given by Eq. (52) is, for example, {{a}, {a, b}} and the six g-sequential chained towers

    C1 = {∅, {a}, {a, b}, {a, b, c}},    C2 = {∅, {a}, {a, c}, {a, b, c}}
    C3 = {∅, {b}, {a, b}, {a, b, c}},    C4 = {∅, {b}, {b, c}, {a, b, c}}
    C5 = {∅, {c}, {a, c}, {a, b, c}},    C6 = {∅, {c}, {b, c}, {a, b, c}}

built from the smallest element ∅ corresponding to the six distinct ways of reaching {a, b, c} from ∅ along the sides of the cube marked on the figure with solid lines, all belong to 𝒳; see Fig. 11(b). An example of a tower in (𝒳, ⊆) which is not a chain is T = {∅, {a}, {b}, {c}, {a, b}, {a, c}, {b, c}, {a, b, c}}. Hence the common infimum towered chained subset

    𝒞_T = {∅, {a, b, c}} = →T ⊆ P(X)

of 𝒳, with

    sup_{𝒞_T}(𝒞_T) = C← = {a, b, c} = g(C←) ∈ 𝒞_T ⊆ P(X)

the only maximal element of P(X). Zorn's Lemma now assures the existence of a maximal element c ∈ X. Observe how the maximal element of (X, ⪯) is obtained by going one level higher to 𝒳 at the Hausdorff stage and returning to the base level X at Zorn, see Fig. 10 for a schematic summary of this sequence of steps.

(2) Basis of a vector space. A linearly independent set of vectors in a vector space X that spans the space is known as a Hamel basis of X. To prove the existence of a Hamel basis in a vector space, Zorn's lemma is invoked as follows.

The ground base level of the linearly independent subsets of X

    𝒳 = {{x_{i_j}}_{j=1}^{J} ∈ P(X) : Span({x_{i_j}}_{j=1}^{J}) = 0 ⇒ (α_j)_{j=1}^{J} = 0 ∀ J ≥ 1} ⊆ P(X),

with Span({x_{i_j}}_{j=1}^{J}) := Σ_{j=1}^{J} α_j x_{i_j}, is such that no x ∈ 𝒳 can be expressed as a linear combination of
the elements of 𝒳 − {x}. 𝒳 clearly has a smallest element, say {x_{i1}}, for some non-zero x_{i1} ∈ X. Let the higher Hausdorff level

    𝔛 = {𝒞 ∈ P²(X) : 𝒞 is a chain in (𝒳, ⊆)} ⊆ P²(X)

and collection of the chains

    C_{iK} = {{x_{i1}}, {x_{i1}, x_{i2}}, . . . , {x_{i1}, x_{i2}, . . . , x_{iK}}} ∈ P²(X)

of 𝒳 comprising linearly independent subsets of X be g-built from the smallest {x_{i1}}. Any chain 𝒞 of 𝔛 is bounded above by the union C∗ = sup_𝔛 𝒞 = ⋃_{C ∈ 𝒞} C which is a chain in 𝒳 containing {x_{i1}}, thereby verifying (ST2) for 𝔛. Application of the tower theorem to 𝔛 implies that the element

    𝒞_T = {C_{i1}, C_{i2}, . . . , C_{in}, . . .} = →T ⊆ P²(X)

in 𝔛 of chains of 𝒳 is a g-sequential fully ordered towered subset of (𝔛, ⊆) consisting of the common elements of all g-sequential towers of (𝔛, ⊆), that in fact is a chained principal ultrafilter on (P(X), ⊆) generated by the filter-base {{{x_{i1}}}} at {x_{i1}}, where

    T = {C_{i1}, C_{i2}, . . . , C_{jn}, C_{jn+1}, . . .}

for some n ∈ N is an example of a non-chained g-tower whenever (C_{jk})_{k=n}^{∞} is neither contained in nor contains any member of the (C_{ik})_{k=1}^{∞} chain. Hausdorff's chain theorem now yields the fixed-point g-chain C← ∈ 𝔛 of 𝒳

    sup_{𝒞_T}(𝒞_T) = C← = {{x_{i1}}, {x_{i1}, x_{i2}}, {x_{i1}, x_{i2}, x_{i3}}, . . .} = g(C←) ∈ 𝒞_T ⊆ P²(X)

as a maximal totally ordered principal filter on X that is generated by the filter-base {{x_{i1}}} at x_{i1}, whose supremum B = {x_{i1}, x_{i2}, . . .} ∈ P(X) is, by Zorn's lemma, a maximal element of the base level 𝒳. This maximal linearly independent subset of X is the required Hamel basis for X: Indeed, if the span of B is not the whole of X, then B ∪ {x}, with x ∉ Span(B), would, by definition, be a linearly independent set of X strictly larger than B, contradicting the assumed maximality of the latter. It needs to be understood that since the infinite basis cannot be classified as being linearly independent, we have here an important example of the supremum of the maximal chained set not belonging to the set even though this criterion was explicitly used in the construction process according to (ST2) and (ST3).

Compared to this purely algebraic concept of basis in a vector space, is the Schauder basis in a normed space which combines topological structure with the linear in the form of convergence: If a normed vector space contains a sequence (e_i)_{i∈Z+} with the property that for every x ∈ X there is a unique sequence of scalars (α_i)_{i∈Z+} such that the remainder ‖x − (α_1 e_1 + α_2 e_2 + · · · + α_I e_I)‖ approaches 0 as I → ∞, then the collection (e_i) is known as a Schauder basis for X.

(3) Ultrafilter. Let X be a set. The set FS = {S_α ∈ P(X) : S_α ∩ S_β ≠ ∅, ∀α ≠ β} ⊆ P(X) of all nonempty subsets of X with finite intersection property is known as a filter subbase on X and FB = {B ⊆ X : B = ⋂_{i∈I⊂D} S_i}, for I ⊂ D a finite subset of a directed set D, is a filter-base on X associated with the subbase FS; cf. Appendix A.1. Then the filter generated by FS consisting of every superset of the finite intersections B ∈ FB of sets of FS is the smallest filter that contains the subbase FS and base FB. For notational simplicity, we will denote the subbase FS in the rest of this example simply by S.

Consider the base-level ground set of all filter subbases on X

    𝒮 = {S ∈ P²(X) : ⋂ R ≠ ∅ for every finite subset ∅ ≠ R ⊆ S} ⊆ P²(X),

ordered by inclusion in the sense that S_α ⊆ S_β for all α ⪯ β ∈ D, and let the higher Hausdorff-level

    X̃ = {C ∈ P³(X) : C is a chain in (𝒮, ⊆)} ⊆ P³(X)

comprising the collection of the totally ordered chains

    C_κ = {{S_α}, {S_α, S_β}, . . . , {S_α, S_β, . . . , S_κ}} ∈ P³(X)

of 𝒮 be g-built from the smallest {S_α}; then an ultrafilter on X is a maximal member S+ of (𝒮, ⊆) in the usual sense that any subbase S on X must necessarily be contained in S+ so that S+ ⊆ S ⇒ S = S+ for any S ⊆ P(X) with FIP. The tower theorem now implies that the element

    𝒞̃_T = {C_α, C_β, . . . , C_ν, . . .} = →T ⊆ P³(X)

of P⁴(X), which is a chain in X̃ of the chains of 𝒮, is a g-sequential fully ordered towered subset of the common elements of all sequential towers of (X̃, ⊆)
that is a chained principal ultrafilter on (P²(X), ⊆) generated by the filter-base {{{S_α}}} at {S_α}, where

    T̃ = {C_α, C_β, . . . , C_σ, C_ς, . . .},

is an obvious example of a non-chained g-tower whenever (C_σ) is neither contained in, nor contains, any member of the C_α-chain. Hausdorff's chain theorem now yields the fixed-point C̃← ∈ X̃

    sup_{𝒞̃_T}(𝒞̃_T) = C̃← = {{S_α}, {S_α, S_β}, {S_α, S_β, S_γ}, . . .} = g(C̃←) ∈ 𝒞̃_T ⊆ P³(X)

as a maximal totally ordered g-chained towered subset of X that is, by Zorn's lemma, a maximal element of the base level subset 𝒮 of P²(X). C̃← is a chained principal ultrafilter on (P(X), ⊆) generated by the filter-base {{S_α}} at S_α, while S+ = {S_α, S_β, S_γ, . . .} ∈ P²(X) is a (non-principal) ultrafilter on X, characterized by the property that any collection of subsets on X with FIP (that is any filter subbase on X) must be contained in the maximal set S+ having FIP, that is not a principal filter unless S_α is a singleton set {x_α}.

What emerges from these applications of Zorn's Lemma is the remarkable fact that infinities (the dot-dot-dots) can be formally introduced as "limiting cases" of finite systems in a purely set-theoretic context without the need for topologies, metrics or convergences. The significance of this observation will become clear from our discussions on filters and topology leading to Sec. 4.2 below. Also, the observation on the successive iterates of the power sets P(X) in the examples above was to suggest their anticipated role in the complex evolution of a dynamical system that is expected to play a significant part in our future interpretation and understanding of this adaptive and self-organizing phenomenon of nature.

End Tutorial 5

From the examples in Tutorial 5, it should be clear that the sequential steps summarized in Fig. 10 are involved in an application of Zorn's lemma to show that a partially ordered set has a maximal element with respect to its order. Thus for a partially ordered set (X, ⪯), form the set 𝒳 of all chains C in X. If C+ is a maximal chain of X obtained by the Hausdorff Maximal Principle from the chain 𝒞 of all chains of X, then its supremum u is a maximal element of (X, ⪯). This sequence is now applied, as in Example 4.2(1), to the set of arbitrary relations Multi(X) on an infinite set X in order to formulate our definition of chaos that follows.

Let f be a noninjective map in Multi(X) and P(f) the number of injective branches of f. Denote by

    F = {f ∈ Multi(X) : f is a noninjective function on X} ⊆ Multi(X)

the resulting basic collection of noninjective functions in Multi(X).

(i) For every α in some directed set D, let F have the extension property

    (∀f_α ∈ F)(∃f_β ∈ F) : P(f_α) ≤ P(f_β)

(ii) Let a partial order ⪯ on Multi(X), defined for f_α, f_β ∈ Map(X) ⊆ Multi(X) by

    P(f_α) ≤ P(f_β) ⇔ f_α ⪯ f_β,        (54)

with P(f) := 1 for the smallest f, define a partially ordered subset (F, ⪯) of Multi(X). This is actually a preorder on Multi(X) in which functions with the same number of injective branches are equivalent to each other.

(iii) Let

    C_ν = {f_α ∈ Multi(X) : f_α ⪯ f_ν} ∈ P(F),    ν ∈ D,

be g-chains of non-injective functions of Multi(X) and

    𝒳 = {C ∈ P(F) : C is a chain in (F, ⪯)} ⊆ P(F)

denote the corresponding Hausdorff level of all chains of F, with

    𝒞_T = {C_α, C_β, . . . , C_ν, . . .} = →T ⊆ P(F)

being a g-sequential tower in 𝒳. By Hausdorff Maximal Principle, there is a maximal fixed-point g-towered chain C← ∈ 𝒳 of F

    sup_{𝒞_T}(𝒞_T) = C← = {f_α, f_β, . . .} = g(C←) ∈ 𝒞_T ⊆ P(F).

Zorn's Lemma applied to this maximal chain yields its supremum as the maximal element of C←, and thereby of F. It needs to be appreciated, as in the case of the algebraic Hamel basis, that the existence of this maximal non-functional element was
obtained purely set theoretically as the "limit" of a net of functions with increasing nonlinearity, without resorting to any topological arguments. Because it is not a function, this supremum does not belong to the functional g-towered chain having it as a fixed point, and this maximal chain does not possess a largest, or even a maximal, element, although it does have a supremum.24 The supremum is a contribution of the inverse functional relations (f_α^−) in the following sense. From Eq. (2), the net of increasingly non-injective functions of Eq. (54) implies a corresponding net of increasingly multivalued functions ordered inversely by the inverse relation f_α ⪯ f_β ⇔ f_β^− ⪯ f_α^−. Thus the inverse relations, which are as much an integral part of graphical convergence as are the direct relations, have a smallest element belonging to the multifunctional class. Clearly, this smallest element as the required supremum of the increasingly non-injective tower of functions defined by Eq. (54), serves to complete the significance of the tower by capping it with a "boundary" element that can be taken to bridge the classes of functional and non-functional relations on X.

We are now ready to define a maximally ill-posed problem f(x) = y for x, y ∈ X in terms of a maximally non-injective map f as follows.

Definition 4.1 (Chaotic map). Let A be a non-empty closed set of a compact Hausdorff space X. A function f ∈ Multi(X), equivalently the sequence of functions (f_i), is maximally non-injective or chaotic on A with respect to the order relation (54) if

(a) for any f_i on A there exists an f_j on A satisfying f_i ⪯ f_j for every j > i ∈ N.
(b) the set D+ consists of a countable collection of isolated singletons.

Definition 4.2 (Maximally ill-posed problem). Let A be a non-empty closed set of a compact Hausdorff space X and let f be a functional relation in Multi(X). The problem f(x) = y is maximally ill-posed at y if f is chaotic on A.

As an example of the application of these definitions, on the dense set D+, the tent map satisfies both the conditions of sensitive dependence on initial conditions and topological transitivity [Devaney, 1989] and is also maximally non-injective; the tent map is therefore chaotic on D+. In contrast, the examples of Secs. 1 and 2 are not chaotic as the maps are not topologically transitive, although the Liapunov exponents, as in the case of the tent map, are positive. Here the (f^n) are identified with the iterates of f, and the "fixed point" as one through which graphs of all the functions on residual index subsets pass. When the set of points D+ is dense in [0, 1] and both D+ and [0, 1] − D+ = [0, 1] − ⋃_{i=0}^{∞} f^{−i}(Per(f)) (where Per(f) denotes the set of periodic points of f) are totally disconnected, it is expected that at any point on this complement the behavior of the limit will be similar to that on D+: these points are special as they tie up the iterates on Per(f) to yield the multifunctions. Therefore in any neighborhood U of a D+-point, there is an x0 at which the forward orbit {f^i(x0)}_{i≥0} is chaotic in the sense that

(a) the sequence neither diverges nor does it converge in the image space of f to a periodic orbit of any period, and
(b) the Liapunov exponent, given by

    λ(x0) := lim_{n→∞} ln |df^n(x0)/dx|^{1/n} = lim_{n→∞} (1/n) Σ_{i=0}^{n−1} ln |df(x_i)/dx|,    x_i = f^i(x0),

which is a measure of the average slope of an orbit at x0 or equivalently of the average loss of information of the position of a point after one iteration, is positive.

Thus an orbit with positive Liapunov exponent is chaotic if it is not asymptotic (that is neither convergent nor adherent, having no convergent suborbit in the sense of Appendix A.1) to an unstable periodic orbit or to any other limit set on which the dynamics is simple. A basic example of a chaotic orbit is that of an irrational in [0, 1] under the shift map and that of the chaotic set its closure, the full unit interval.

Let f ∈ Map((X, U)) and suppose that A = {f^j(x0)}_{j∈N} is a sequential set corresponding to the orbit Orb(x0) = (f^j(x0))_{j∈N}, and let f_{Ri}(x0) = ⋃_{j≥i} f^j(x0) be the i-residual of the sequence (f^j(x0))_{j∈N}, with FB_{x0} = {f_{Ri}(x0) : Res(N) → X

24 A similar situation arises in the following more intuitive example. Although the subset A = {1/n}_{n∈Z+} of the interval I = [−1, 1] has no smallest or minimal elements, it does have the infimum 0. Likewise, although A is bounded below by any element of [−1, 0), it has no greatest lower bound in [−1, 0) ∪ (0, 1].
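The Liapunov exponent in condition (b) of this page can be estimated numerically as the orbit average of ln |f′(x_i)|. The following is a minimal sketch, assuming the logistic map λx(1 − x) as the test map; the function names, the transient length and the starting point 0.2 are illustrative choices, not values from the text. For λ = 4 the exponent is known to equal ln 2, and for λ = 2.95 the orbit settles on the stable fixed point, where the exponent is ln |2 − λ| = ln 0.95, a negative number.

```python
import math


def logistic(lam):
    """Return the map x -> lam*x*(1-x) together with its derivative."""
    return (lambda x: lam * x * (1.0 - x)), (lambda x: lam * (1.0 - 2.0 * x))


def liapunov(f, df, x0, n=100_000, transient=1_000):
    """Average of ln|f'(x_i)| along the orbit of x0, after discarding
    `transient` initial iterations."""
    x = x0
    for _ in range(transient):
        x = f(x)
    total = 0.0
    for _ in range(n):
        total += math.log(abs(df(x)))
        x = f(x)
    return total / n


chaotic = liapunov(*logistic(4.0), x0=0.2)   # expected to lie near ln 2
ordered = liapunov(*logistic(2.95), x0=0.2)  # expected to lie near ln 0.95
print(chaotic, ordered)
```

A finite orbit average only approximates the limit, so the positive estimate should be read as indicative; a positive exponent alone is also not sufficient for chaos in the sense used here, as the surrounding discussion stresses.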
for all i ∈ N} being the decreasingly nested filter-base associated with Orb(x0). The so-called ω-limit set of x0 given by

    ω(x0) := {x ∈ X : (∃n_k ∈ N)(n_k → ∞)(f^{n_k}(x0) → x)}
           = {x ∈ X : (∀N ∈ 𝒩_x)(∀f_{Ri} ∈ FB_{x0})(f_{Ri}(x0) ∩ N ≠ ∅)}        (55)

is simply the adherence set adh(f^j(x0)) of the sequence (f^j(x0))_{j∈N}, see Eq. (A.39); hence Definition A.1.11 of the filter-base associated with a sequence and Eqs. (A.16), (A.24), (A.31) and (A.34) allow us to express ω(x0) more meaningfully as

    ω(x0) = ⋂_{i∈N} Cl(f_{Ri}(x0)).        (56)

It is clear from the second of Eqs. (55) that for a continuous f and any x ∈ X, x ∈ ω(x0) implies f(x) ∈ ω(x0), so that the entire orbit of x lies in ω(x0) whenever x does; that is, the ω-limit set is positively invariant. It is also closed because the adherent set is a closed set according to Theorem A.1.3. Hence x0 ∈ ω(x0) ⇒ A ⊆ ω(x0) reduces the ω-limit set to the closure of A without any isolated points, A ⊆ Der(A). In terms of Eq. (A.33) involving principal filters, Eq. (56) in this case may be expressed in the more transparent form ω(x0) = Cl(FP({f^j(x0)}_{j=0}^{∞})) where the principal filter FP({f^j(x0)}_{j=0}^{∞}) at A consists of all supersets of A = {f^j(x0)}_{j=0}^{∞}, and ω(x0) represents the adherence set of the principal filter at A, see the discussion following Theorem A.1.3. If A represents a chaotic orbit under this condition, then ω(x0) is sometimes known as a chaotic set [Alligood et al., 1997]; thus the chaotic orbit infinitely often visits every member of its chaotic set25 which is simply the ω-limit set of a chaotic orbit that is itself contained in its own limit set. Clearly the chaotic set is positive invariant, and from Theorem A.1.3 and its corollary it is also compact. Furthermore, if all (sub)sequences emanating from points x0 in some neighborhood of the set converge to it, then ω(x0) is called a chaotic attractor, see [Alligood et al., 1997]. As common examples of chaotic sets that are not attractors mention may be made of the tent map with a peak value larger than 1 at 0.5, and the logistic map with λ > 4, again with a peak value at 0.5 exceeding 1.

It is important that the difference in the dynamical behavior of the system on D+ and its complement be appreciated. At any fixed point x of f^i in D+ (or at its equivalent images in [x]) the dynamics eventually gets attached to the (equivalent) fixed point, and the sequence of iterates converges graphically in Multi(X) to x (or its equivalent points). When x ∉ D+, however, the orbit A = {f^i(x)}_{i∈N} is chaotic in the sense that (f^i(x)) is not asymptotically periodic and, not being attached to any particular point, the iterates wander about in the closed chaotic set ω(x) = Der(A) containing A such that for any given point in the set, some subsequence of the chaotic orbit gets arbitrarily close to it. Such sequences do not converge anywhere but only frequent every point of Der(A). Thus in the limit of progressively larger iterations there is complete uncertainty of the outcome of an experiment conducted at either of these two categories of initial points: on D+ this is due to a random choice from a multifunctional set of equally probable outputs as dictated by the specific conditions under which the experiment was conducted at that instant, whereas on its complement the uncertainty is due to the chaotic behavior of the functional iterates themselves. Nevertheless it must be clearly understood that this latter behavior is entirely due to the multifunctional limits at the D+ points which completely determine the behavior of the system on its complement. As an explicit illustration of this situation, recall that for the shift map 2x mod(1) the D+ points are the rationals on [0, 1], and any irrational is represented by a non-terminating and non-repeating decimal so that almost all decimals in [0, 1] in any base contain all possible sequences of any number of digits. For the logistic map, the situation is more complex, however. Here the onset of chaos marking the end of the period doubling sequence at λ∗ = 3.5699456 is signaled by the disappearance of all stable fixed points, Fig. 13(c), with Fig. 13(a) being a demonstration of the stable limits for λ = 3.569 that show up as convergence of the iterates to constant valued functions (rather than as constant valued inverse functions) at stable fixed points, shown more emphatically in Fig. 12(a). What actually happens at λ∗ is shown in Fig. 16(a) in the next subsection: the almost vertical lines produced at a large, but finite, iterations i

25 How does this happen for A = {f^i(x0)}_{i∈N} that is not the constant sequence (x0) at a fixed point? As i ∈ N increases, points are added to {x0, f(x0), . . . , f^I(x0)} not, as would be the case in a normal sequence, as a piled-up Cauchy tail, but as points generally lying between those already present; recall a typical graph as of Fig. 9, for example.
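The special role of the rationals for the shift map 2x mod(1) can be seen with exact arithmetic: every rational initial point has an eventually periodic orbit. The sketch below is an illustration only, using Python's `fractions.Fraction` to avoid floating-point round-off; the helper name `eventual_cycle` and the two sample points are assumptions made for the example.

```python
from fractions import Fraction


def shift(x):
    """Doubling (shift) map 2x mod 1 in exact rational arithmetic."""
    return (2 * x) % 1


def eventual_cycle(x0, max_steps=10_000):
    """Return (preperiod, period) of the orbit of x0, found by recording
    the first time a value repeats."""
    seen = {}
    x = x0
    for step in range(max_steps):
        if x in seen:
            return seen[x], step - seen[x]
        seen[x] = step
        x = shift(x)
    raise RuntimeError("no repeat found within max_steps")


print(eventual_cycle(Fraction(1, 7)))  # (0, 3): 1/7 -> 2/7 -> 4/7 -> 1/7
print(eventual_cycle(Fraction(3, 8)))  # (3, 1): 3/8 -> 3/4 -> 1/2 -> 0 -> 0
```

Exact rationals are used deliberately: in binary floating point every representable number is a dyadic rational, so a float orbit of this map collapses to 0 after a few dozen steps and cannot represent the behavior at irrational points.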
[Fig. 12 panels: (a) Stable 1-cycle, λ = 2.95, graphical limit at 9001; (b) Stable 2-cycle, λ = 3.4, graphical limit at 9001-9002; (c) Stable 4-cycle, λ = 3.5, graphical limit at 9001-9004; (d) Stable 8-cycle, λ = 3.55, graphical limit at 9001-9008.]

Fig. 12. Fixed points and cycles of logistic map. The isolated fixed point of (b) yields two non-fixed points to which the iterates converge simultaneously in the sense that the generated sequence converges to one iff it converges to the other. This suggests that nonlinear dynamics of a system can lead to a situation in which sequences in a Hausdorff space may converge to more than one point. Since convergence depends on the topology (Corollary to Theorem A.1.5), this may be interpreted to mean that nonlinearity tends to modify the basic structure of a space. The sequence of points generated by the iterates of the map are marked on the y-axis of (a)–(c) in italics. The singletons {x} are ω-limit sets of the respective fixed point x and are generated by the constant sequence (x, x, . . .). Whereas in (a) this is the limit of every point in (0, 1), in the other cases these fixed points are isolated in the sense of Definition 2.3. The isolated points, however, give rise to sequences that converge to more than one point in the form of limit cycles as shown in (b)–(d).

(the multifunctions are generated only in the limiting sense of i → ∞ and represent a boundary between functional and non-functional relations on a set), decrease in magnitude with increasing iterations until they reduce to points. This gives rise to a (totally disconnected) Cantor set on the y-axis in contrast with the connected intervals that the multifunctional limits at λ > λ∗ of Figs. 16(b)–16(d) produce. By our characterization Definition 4.1 of chaos therefore, λx(1 − x) is chaotic for the values of λ > λ∗ that are shown in Fig. 16. We return to this case in the following subsection.

As an example of chaos in a noniterative system, we investigate the following question: While maximality of non-injectiveness produced by an increasing number of injective branches is necessary for a family of functions to be chaotic, is this also sufficient for the system to be chaotic? This is an
[Figure 13. Six panels, two for each of three values of λ. Panels (a), (c), (e): "Iterate 9000 of logistic map", for order at λ = 3.569, the "edge of chaos" at $\lambda = \lambda_* = 3.5699456$, and chaos at λ = 3.57, respectively. Panels (b), (d), (f): "9000 iterations on logistic map" at the same three values of λ.]

Fig. 13. Multifunctional and cobweb plots of λx(1 − x). Comparison of the graphs for the three values of λ shown in (a)–(f) illustrates how the dramatic changes in the character of the former are conspicuously absent in the conventional plots that display no perceptible distinction between the three cases.
important question especially in the context of a non-iterative family of functions where fixed points are no longer relevant.

Consider the sequence of functions $|\sin(\pi n x)|_{n=1}^{\infty}$. The graphs of the subsequence $|\sin(2^{n-1}\pi x)|$ and of the sequence $(t^n(x))$ on [0, 1] are qualitatively similar in that they both contain $2^{n-1}$ of their functional graphs each on a base of $1/2^{n-1}$. Thus both $|\sin(2^{n-1}\pi x)|_{n=1}^{\infty}$ and $(t^n(x))_{n=1}^{\infty}$ converge graphically to the multifunction [0, 1] on the same set of points equivalent to 0. This is sufficient for us to conclude that $|\sin(2^{n-1}\pi x)|_{n=1}^{\infty}$, and hence $|\sin(\pi n x)|_{n=1}^{\infty}$, is chaotic on the infinite equivalent set [0]. While Fig. 9 was a comparison of the first four iterates of the tent and absolute sine maps, Fig. 14 shows the "converged" graphical limits after 17 iterations.

4.1. The chaotic attractor

One of the most fascinating characteristics of chaos in dynamical systems is the appearance of attractors the dynamics on which are chaotic. For a subset A of a topological space $(X, \mathcal{U})$ such that $R(f(A))$ is contained in A (in this section, unless otherwise stated to the contrary, f(A) will denote the graph and not the range (image) of f), which ensures that the iteration process can be carried out in A, let

\[
f_{R_i}(A) = \bigcup_{j \ge i \in \mathbb{N}} f^j(A) = \bigcup_{j \ge i \in \mathbb{N}} \Bigl( \bigcup_{x \in A} f^j(x) \Bigr) \tag{57}
\]

generate the filter-base $\mathcal{F}_B$ with $A_i := f_{R_i}(A) \in \mathcal{F}_B$ being decreasingly nested, $A_{i+1} \subseteq A_i$ for all $i \in \mathbb{N}$, in accordance with Definition A.1.1. The existence of a maximal chain with a corresponding maximal element as assured by the Hausdorff Maximal Principle and Zorn's Lemma respectively implies a nonempty core of $\mathcal{F}_B$. As in Sec. 3 following Definition 3.3, we now identify the filterbase with the neighborhood base at $f^\infty$ which allows us to define

\[
\mathrm{Atr}(A_1) \overset{\mathrm{def}}{=} \mathrm{adh}(\mathcal{F}_B) = \bigcap_{A_i \in \mathcal{F}_B} \mathrm{Cl}(A_i) \tag{58}
\]

as the attractor of the set $A_1$, where the last equality follows from Eqs. (59) and (20) and the closure is with respect to the topology induced by the neighborhood filter base $\mathcal{F}_B$. Clearly the attractor as defined here is the graphical limit of the sequence of functions $(f^i)_{i\in\mathbb{N}}$ which may be verified by reference to Definition A.1.8, Theorem A.1.3 and the proofs of Theorems A.1.4 and A.1.5, together with the directed set Eq. (A.10) with direction (A.11). The basin of attraction of the attractor is $A_1$ because the graphical limit $(D_+, F(D_+)) \cup (G(R_+), R_+)$ of Definition 3.1 may be obtained, as indicated above, by a proper choice of sequences associated with A. Note that in the context of iterations of functions, the graphical limit $(D_+, y_0)$ of the sequence $(f^n(x))$ denotes a stable fixed point $x_*$ with image $x_* = f(x_*) = y_0$ to which iterations starting at any point $x \in D_+$ converge. The graphical limits $(x_{i0}, R_+)$ are generated with respect to the class $\{x_{i*}\}$ of points satisfying $f(x_{i0}) = x_{i*}$, i = 0, 1, 2, . . . equivalent to the unstable fixed point $x_* := x_{0*}$ to which inverse iterations starting at any initial point in $R_+$ must converge. Even though only $x_*$ is inverse stable, an equivalent class of graphically converged limit multis is produced at every member of the class $x_{i*} \in [x_*]$, resulting in the far-reaching consequence that every member of the class is as significant as the parent fixed point $x_*$ from which they were born in determining the dynamics of the evolving system. The point to remember about infinite intersections of a collection of sets having finite intersection property, as in Eq. (58), is that this may very well be empty; recall, however, that in a compact space this is guaranteed not to be so. In the general case, if $\mathrm{core}(\mathcal{A}) \neq \emptyset$ then $\mathcal{A}$ is the principal filter at this core, and $\mathrm{Atr}(A_1)$ by Eqs. (58) and (A.33) is the closure of this core, which in this case of topology being induced by the filterbase, is just the core itself. $A_1$, by its very definition, is a positively invariant set as any sequence of graphs converging to $\mathrm{Atr}(A_1)$ must be eventually in $A_1$: the entire sequence therefore lies in $A_1$. Clearly, from Theorem A.3.1 and its corollary, the attractor is a positively invariant compact set. A typical attractor is illustrated by the derived sets in the second column of Fig. 22 which also illustrates that the set of functional relations is open in Multi(X); specifically functional–non-functional correspondences are neutral-selfish related as in Fig. 22, 3–2, with the attracting graphical limit of Eq. (58) forming the boundary of (finitely) many-to-one functions and the one-to-(finitely) many multifunctions.

Equation (58) is to be compared with the image definition of an attractor [Stuart & Humphries, 1996] where f(A) denotes the range and not the graph of f. Then Eq. (58) can be used to define a
[Figure 14. Two panels plotted over 0 ≤ x ≤ .0008 with values in [0, 1]: (a) 17th iterate of tent map; (b) graph of $|\sin(2^{16}\pi x)|$.]

Fig. 14. Similarity in the behavior of the graphs of (a) tent and (b) $|\sin(2^{16}\pi x)|$ maps at 17 iterations demonstrates chaoticity of the latter.

sequence of points $x_k \in A_{n_k}$ and hence the subset

\[
\begin{aligned}
\omega(A) &\overset{\mathrm{def}}{=} \{x \in X : (\exists n_k \in \mathbb{N})(n_k \to \infty)(\exists x_k \in A_{n_k})\,(f^{n_k}(x_k) \to x)\} \\
&= \{x \in X : (\forall N \in \mathcal{N}_x)(\forall A_i \in \mathcal{A})\,(N \cap A_i \neq \emptyset)\}
\end{aligned} \tag{59}
\]

as the corresponding attractor of A that satisfies an equation formally similar to (58) with the difference that the filter-base $\mathcal{A}$ is now in terms of the image f(A) of A, which allows the adherence expression to take the particularly simple form

\[
\omega(A) = \bigcap_{i \in \mathbb{N}} \mathrm{Cl}(f^i(A)). \tag{60}
\]

The complementary subset excluded from this definition of ω(A), as compared to $\mathrm{Atr}(A_1)$, that is required to complete the formalism is given by Eq. (61) below. Observe that the equation for ω(A) is essentially Eq. (A.15), even though we prefer to use the alternate form of Eq. (A.16) as this brings out more clearly the frequenting nature of the sequence. The basin of attraction

\[
\begin{aligned}
B_f(A) &= \{x \in A : \omega(x) \subseteq \mathrm{Atr}(A)\} \\
&= \{x \in A : (\exists n_k \in \mathbb{N})(n_k \to \infty)\,(f^{n_k}(x) \to x_* \in \omega(A))\}
\end{aligned} \tag{61}
\]

of the attractor is the smallest subset of X in which sequences generated by f must eventually lie in order to adhere at ω(A). Comparison of Eqs. (62) with (33) and (61) with (32) shows that ω(A) can be identified with the subset $R_+$ on the y-axis on which the multifunctional limits $G : R_+ \to X$ of graphical convergence are generated, with its basin of attraction being contained in the $D_+$ associated with the injective branch of f that generates $R_+$. In summary it may be concluded that since definitions (59) and (61) involve both the domain and range of f, a description of the attractor in terms of the graph of f, like that of Eq. (58), is more pertinent and meaningful as it combines the requirements of both these equations. Thus, for example, as ω(A) is not the function $G(R_+)$, this attractor does not include the equivalence class of inverse stable points that may be associated with $x_*$, see for example Fig. 15.

From Eq. (59), we may make the particularly simple choice of $(x_k)$ to satisfy $f^{n_k}(x_{-k}) = x$ so that $x_{-k} = f_B^{-n_k}(x)$, where $x_{-k} \in [x_{-k}] := f^{-n_k}(x)$ is the element of the equivalence class of the inverse image of x corresponding to the injective branch $f_B$. This choice is of special interest to us as it is the class that generates the G-function on $R_+$ in graphical convergence. This allows us to express ω(A) as

\[
\omega(A) = \{x \in X : (\exists n_k \in \mathbb{N})(n_k \to \infty)\,(f_B^{-n_k}(x) = x_{-k} \text{ converges in } (X, \mathcal{U}))\}; \tag{62}
\]

note that the $x_{-k}$ of this equation and the $x_k$ of Eq. (59) are, in general, quite different points.

A simple illustrative example of the construction of ω(A) for the positive injective branch of the homeomorphism $(4x^2 - 1)/3$, −1 ≤ x ≤ 1, is
[Figure 15. Graph of $f = (4x^2 - 1)/3$ on −1 ≤ x ≤ 1, with curves labelled $f_B$, $f_B^2$, $f_B^3$, italicized iterate numbers, the level −0.25, and marked points $x_1$, $x_2$, $x_{-1}$, $x_{-3}$.]

Fig. 15. The attractor for $f(x) = (4x^2 - 1)/3$, for −1 ≤ x ≤ 1. The converging sequences are denoted by arrows on the right, and $(x_k)$ are chosen according to the construction shown. This example demonstrates how although A ⊆ f(A), where A = [0, 1] is the domain of the positive injective branch of f, the succeeding images $(f^i(A))_{i \ge 1}$ satisfy the required restriction for iteration, and A in the discussion above can be taken to be f(A); this is permitted as only a finite number of iterates is thereby discarded. It is straightforward to verify that $\mathrm{Atr}(A_1) = (-1, [-0.25, 1]) \cup ((-1, 1), -0.25) \cup (1, [-0.25, 1])$ with F(x) = −0.25 on $D_- = (-1, 1) = D_+$ and G(y) = 1, and −1 on $R_- = [-0.25, 1] = R_+$. By comparison, ω(A) from either its definition Eq. (59) or from the equivalent intersection expression Eq. (60), is simply the closed interval $R_+ = [-0.25, 1]$. The italicized iterate numbers on the graphs show how the oscillations die out with increasing iterations from x = ±1 and approach −0.25 in all neighborhoods of 0.

shown in Fig. 15, where the arrow-heads denote the converging sequences $f^{n_i}(x_i) \to x$ and $f^{n_i - m}(x_i) \to x_{-m}$ which proves invariance of ω(A) for a homeomorphic f; here continuity of the function and its inverse is explicitly required for invariance. Positive invariance of a subset A of X implies that for any $n \in \mathbb{N}$ and x ∈ A, $f^n(x) = y_n \in A$, while negative invariance assures that for any y ∈ A, $f^{-n}(y) = x_{-n} \in A$. Invariance of A in both the forward and backward directions therefore means that for any y ∈ A and $n \in \mathbb{N}$, there exists an x ∈ A such that $f^n(x) = y$. In interpreting this figure, it may be useful to recall from Definition 4.1 that an increasing number of injective branches of f is a necessary, but not sufficient, condition for the occurrence of chaos; thus in Figs. 12(a) and 15, increasing noninjectivity of f leads to constant valued limit functions over a connected $D_+$ in a manner similar to that associated with the classical Gibbs phenomenon in the theory of Fourier series.

Graphical convergence of an increasingly nonlinear family of functions implied by its increasing non-injectivity may now be combined with the requirements of an attractor to lead to the concept of a chaotic attractor to be that on which the dynamics is chaotic in the sense of Definitions 4.1 and 4.2. Hence

Definition 4.3 (Chaotic Attractor). Let A be a positively invariant subset of X. The attractor Atr(A) is chaotic on A if there is sensitive dependence on initial conditions for all x ∈ A. The sensitive dependence manifests itself as multifunctional graphical limits for all $x \in D_+$ and as chaotic orbits when $x \notin D_+$.

The picture of chaotic attractors that emerges from the foregoing discussions and our characterization of chaos of Definition 4.1 is that it is a subset of X that is simultaneously "spiked" multifunctional on the y-axis and consists of a dense collection of singleton domains of attraction on the x-axis. This is illustrated in Fig. 16 which shows some typical chaotic attractors. The first four diagrams (a)–(d) are for the logistic map with (b)–(d) showing the 4-, 2- and 1-piece attractors for λ = 3.575,
3.66, and 3.8, respectively that are in qualitative agreement with the standard bifurcation diagram reproduced in (e). Figures 16(b)–16(d) have the advantage of clearly demonstrating how the attractors are formed by considering the graphically converged limit as the object of study unlike in (e) which shows the values of the 501–1001th iterates of $x_0 = 1/2$ as a function of λ. The difference in (a) and (b) for a change of λ from $\lambda_* = 3.5699456$ to 3.575 is significant as $\lambda = \lambda_*$ marks the boundary between the nonchaotic region for $\lambda < \lambda_*$ and the chaotic for $\lambda > \lambda_*$ (this is to be understood as being suitably modified by the appearance of the nonchaotic windows for some specific intervals in $\lambda > \lambda_*$). At $\lambda_*$ the generated fractal Cantor set Λ is an attractor as it attracts almost every initial point $x_0$ so that the successive images $x_n = f^n(x_0)$ converge toward the Cantor set Λ. In (f) the chaotic attractor for

[Figure 16, panels (a)–(d), each plotted on [0, 1] × [0, 1]: (a) λ = 3.5699456, iterates 2001–2004; (b) λ = 3.575, iterates 2001–2004; (c) λ = 3.66, iterates 2001–2004; (d) λ = 3.8, iterates 2001–2002.]

Fig. 16. Chaotic attractors for different values of λ. For the logistic map the usual bifurcation diagram (e) shows the chaotic attractors for $\lambda > \lambda_* = 3.5699456$, while (a)–(d) display the graphical limits for four values of λ chosen for the Cantor set and 4-, 2-, and 1-piece attractors, respectively. In (f) the attractor [0, 1] (where the dotted lines represent odd iterates and the solid lines even iterates of f) disappears if f is reflected about the x-axis. The function $f_f(x)$ is given by

\[
f_f(x) = \begin{cases} 2(1 + x)/3, & 0 \le x < 1/2 \\ 2(1 - x), & 1/2 \le x \le 1. \end{cases}
\]
[Figure 16, panels (e)–(f): (e) bifurcation diagram of the logistic map with $\lambda_4 = 3.449$ and $\lambda_*$ marked on the λ-axis; (f) first 12 iterates of $f_f$.]

Fig. 16. (Continued)

the piecewise continuous function on [0, 1]

\[
f_f(x) = \begin{cases} \dfrac{2(1 + x)}{3}, & 0 \le x < \dfrac{1}{2} \\[2mm] 2(1 - x), & \dfrac{1}{2} \le x \le 1, \end{cases}
\]

is [0, 1] where the dotted lines represent odd iterates and the full lines even iterates of f; here the attractor disappears if the function is reflected about the x-axis.

4.2. Why chaos? A preliminary inquiry

The question as to why a natural system should evolve chaotically is both interesting and relevant, and this section attempts to advance a plausible answer to this inquiry that is based on the connection between topology and convergence contained in the Corollary to Theorem A.1.5. Open sets are groupings of elements that govern convergence of nets and filters, because the required property of being either eventually or frequently in (open) neighborhoods of a point determines the eventual behavior of the net; recall in this connection the unusual convergence characteristics in cofinite and cocountable spaces. Conversely for a given convergence characteristic of a class of nets, it is possible to infer the topology of the space that is responsible for this convergence, and it is this point of view that we adopt here to investigate the question of this subsection: recall that our Definitions 4.1 and 4.2 were based on purely algebraic set-theoretic arguments on ordered sets, just as the role of the choice of an appropriate problem-dependent basis was highlighted at the end of Sec. 2. Chaos as manifest in its attractors is a direct consequence of the increasing nonlinearity of the map with increasing iteration; we reemphasize that this is only a necessary condition so that the increasing nonlinearities of Figs. 12 and 15 eventually lead to stable states and not to chaotic instability. Under the right conditions as enunciated following Fig. 10, chaos appears to be the natural outcome of the difference in the behavior of a function f and its inverse $f^-$ under their successive applications. Thus $f = f f^- f$ allows f to take advantage of its multi-inverse to generate all possible equivalence classes that are available, a feature not accessible to $f^- = f^- f f^-$. As we have seen in the foregoing, equivalence classes of fixed points, stable and unstable, are of defining significance in determining the ultimate behavior of an evolving dynamical system and as the eventual (as also frequent) character of a filter or net in a set is dictated by open neighborhoods of points of the set, it is postulated that chaoticity on a set X leads to a reformulation of the open sets of X to equivalence classes generated by the evolving map f, see Example 2.4(3). Such a redefinition of open sets of equivalence classes allows the evolving system to temporally access an ever increasing number of states even though the equivalent fixed points are not fixed under iterations of f except for the parent of the class, and can be considered to be the governing criterion for the cooperative or collective behavior
of the system. The predominance of the role of $f^-$ in $f = f f^- f$ in generating the equivalence classes (that is exploiting the many-to-one character) of f is reflected as limit multis for f (i.e. constant $f^-$ on $R_+$) in $f^- = f^- f f^-$; this interpretation of the dynamics of chaos is meaningful as graphical convergence leading to chaos is a result of pointwise biconvergence of the sequence of iterates of the functions generated by f. But as f is a noninjective function on X possessing the property of increasing nonlinearity in the form of increasing noninjectivity with iteration, various cycles of disjoint equivalence classes are generated under iteration, see for example Fig. 9(a) for the tent map. A reference to Fig. shows that the basic set $X_B$, for a finite number n of iterations of f, contains the parent of each of these open equivalent sets in the domain of f, with the topology on $X_B$ being the corresponding p-images of these disjoint saturated open sets of the domain. In the limit of infinite iterations of f leading to the multifunction $\mathcal{M}$ (this is the $f^\infty$ of Sec. 4.1), the generated open sets constitute a basis for a topology on D(f) and the basis for the topology of R(f) are the corresponding $\mathcal{M}$-images of these equivalent classes. It is our contention that the motive force behind evolution toward a chaos, as defined by Definition 4.1, is the drive toward a state of the dynamical system that supports ininality of the limit multi $\mathcal{M}$; see Appendix A.2 with the discussions on Fig. and Eq. (26) in Sec. 2. In the limit of infinite iterations therefore, the open sets of the range R(f) ⊆ X are the multi images that graphical convergence generates at each of these inverse-stable fixed points. X therefore has two topologies imposed on it by the dynamics of f: the first of equivalence classes generated by the limit multi $\mathcal{M}$ in the domain of f and the second as $\mathcal{M}$-images of these classes in the range of f. Quite clearly these two topologies need not be the same; their intersection therefore can be defined to be the chaotic topology on X associated with the chaotic map f on X. Neighborhoods of points in this topology cannot be arbitrarily small as they consist of all members of the equivalence class to which any element belongs; hence a sequence converging to any of these elements necessarily converges to all of them, and the eventual objective of chaotic dynamics is to generate a topology in X with respect to which elements of the set can be grouped together in as large equivalence classes as possible in the sense that if a net converges simultaneously to points x ≠ y ∈ X then x ∼ y: x is of course equivalent to itself while x, y, z are equivalent to each other iff they are simultaneously in every open set in which the net may eventually belong. This hall-mark of chaos can be appreciated in terms of a necessary obliteration of any separation property that the space might have originally possessed, see property (H3) in Appendix A.3. We reemphasize that a set in this chaotic context is required to act in a dual capacity depending on whether it carries the initial or final topology under $\mathcal{M}$.

This preliminary inquiry into the nature of chaos is concluded in the final section of this paper.

5. Graphical Convergence Works

We present in this section some real evidence in support of our hypothesis of graphical convergence of functions in Multi(X, Y). The example is taken from neutron transport theory, and concerns the discretized spectral approximation [Sengupta, 1988, 1995] of Case's singular eigenfunction solution of the monoenergetic neutron transport equation, [Case & Zweifel, 1967]. The neutron transport equation is a linear form of the Boltzmann equation that is obtained as follows. Consider the neutron-moderator system as a mixture of two species of gases each of which satisfies a Boltzmann equation of the type

\[
\left( \frac{\partial}{\partial t} + \mathbf{v}_i \cdot \nabla \right) f_i(\mathbf{r}, \mathbf{v}, t) = \sum_j \int d\mathbf{v}'\, d\mathbf{v}_1\, d\mathbf{v}_1'\; W_{ij}(\mathbf{v}_i \to \mathbf{v}'; \mathbf{v}_1 \to \mathbf{v}_1') \bigl\{ f_i(\mathbf{r}, \mathbf{v}', t) f_j(\mathbf{r}, \mathbf{v}_1', t) - f_i(\mathbf{r}, \mathbf{v}, t) f_j(\mathbf{r}, \mathbf{v}_1, t) \bigr\}
\]

where

\[
W_{ij}(\mathbf{v}_i \to \mathbf{v}'; \mathbf{v}_1 \to \mathbf{v}_1') = |\mathbf{v} - \mathbf{v}_1|\, \sigma_{ij}(\mathbf{v} - \mathbf{v}', \mathbf{v}_1 - \mathbf{v}_1'),
\]

$\sigma_{ij}$ being the cross-section of interaction between species i and j. Denote neutrons by subscript 1 and the background moderator with which the neutrons interact by 2, and make the assumptions that

(i) The neutron density $f_1$ is much less compared with that of the moderator $f_2$ so that the terms $f_1 f_1$ and $f_1 f_2$ may be neglected in the neutron and moderator equations, respectively.
(ii) The moderator distribution $f_2$ is not affected by the neutrons. This decouples the neutron and moderator equations and leads to an equilibrium
Maxwellian $f_M$ for the moderator while the neutrons are described by the linear equation

\[
\left( \frac{\partial}{\partial t} + \mathbf{v} \cdot \nabla \right) f(\mathbf{r}, \mathbf{v}, t) = \int d\mathbf{v}'\, d\mathbf{v}_1\, d\mathbf{v}_1'\; W_{12}(\mathbf{v} \to \mathbf{v}'; \mathbf{v}_1 \to \mathbf{v}_1') \bigl\{ f(\mathbf{r}, \mathbf{v}', t) f_M(\mathbf{v}_1') - f(\mathbf{r}, \mathbf{v}, t) f_M(\mathbf{v}_1) \bigr\}.
\]

This is now put in the standard form of the neutron transport equation [Williams, 1971]

\[
\left( \frac{1}{v} \frac{\partial}{\partial t} + \hat{\Omega} \cdot \nabla + S(E) \right) \Phi(\mathbf{r}, E, \hat{\Omega}, t) = \int d\hat{\Omega}' \int dE'\; S(\mathbf{r}, E' \to E; \hat{\Omega}' \cdot \hat{\Omega})\, \Phi(\mathbf{r}, E', \hat{\Omega}', t),
\]

where $E = mv^2/2$ is the energy and $\hat{\Omega}$ the direction of motion of the neutrons. The steady state, monoenergetic form of this equation is Eq. (A.53)

\[
\mu \frac{\partial \Phi(x, \mu)}{\partial x} + \Phi(x, \mu) = \frac{c}{2} \int_{-1}^{1} \Phi(x, \mu')\, d\mu', \qquad 0 < c < 1, \quad -1 \le \mu \le 1,
\]

and its singular eigenfunction solution for x ∈ (−∞, ∞) is given by Eq. (A.56)

\[
\Phi(x, \mu) = a(\nu_0) e^{-x/\nu_0} \phi(\mu, \nu_0) + a(-\nu_0) e^{x/\nu_0} \phi(-\nu_0, \mu) + \int_{-1}^{1} a(\nu) e^{-x/\nu} \phi(\mu, \nu)\, d\nu;
\]

see Appendix A.4 for an introductory review of Case's solution of the one-speed neutron transport equation.

The term "eigenfunction" is motivated by the following considerations. Consider the eigenvalue equation

\[
(\mu - \nu) F_\nu(\mu) = 0, \qquad \mu \in V(\mu), \quad \nu \in \mathbb{R} \tag{63}
\]

in the space of multifunctions Multi(V(µ), (−∞, ∞)), where µ is in either of the intervals [−1, 1] or [0, 1] depending on whether the given boundary conditions for Eq. (A.53) are full-range or half-range. If we are looking only for functional solutions of Eq. (63), then the unique function F that satisfies this equation for all possible µ ∈ V(µ) and ν ∈ R − V(µ) is $F_\nu(\mu) = 0$ which means, according to Table 1, that the point spectrum of µ is empty and $(\mu - \nu)^{-1}$ exists for all ν. When ν ∈ V(µ), however, this inverse is not continuous and we show below that in Map(V(µ), 0), ν ∈ V(µ) belongs to the continuous spectrum of µ. This distinction between the nature of the inverses depending on the relative values of µ and ν suggests a wider "non-function" space in which to look for the solutions of operator equations, and in keeping with the philosophy embodied in Fig. of treating inverse problems in the space of multifunctions, we consider all $F_\nu \in \mathrm{Multi}(V(\mu), \mathbb{R})$ satisfying Eq. (63) to be eigenfunctions of µ for the corresponding eigenvalue ν, leading to the following multifunctional solution of (63)

\[
F_\nu(\mu) = \begin{cases} (V(\mu), 0) & \text{if } \nu \notin V(\mu) \\ (V(\mu) - \nu, 0) \cup (\nu, \mathbb{R}) & \text{if } \nu \in V(\mu), \end{cases}
\]

where V(µ) − ν is used as a shorthand for the interval V(µ) with ν deleted. Rewriting the eigenvalue equation (63) as $\mu_\nu(F_\nu(\mu)) = 0$ and comparing this with Fig. , allows us to draw the correspondences

\[
\begin{aligned}
f &\Leftrightarrow \mu_\nu \\
X \text{ and } Y &\Leftrightarrow \{F_\nu \in \mathrm{Multi}(V(\mu), \mathbb{R}) : F_\nu \in \mathcal{D}(\mu_\nu)\} \\
f(X) &\Leftrightarrow \{0 : 0 \in Y\} \\
X_B &\Leftrightarrow \{0 : 0 \in X\} \\
f^- &\Leftrightarrow \mu_\nu^-.
\end{aligned} \tag{64}
\]

Thus a multifunction in X is equivalent to 0 in $X_B$ under the linear map $\mu_\nu$, and we show below that this multifunction is in fact the Dirac delta "function" $\delta_\nu(\mu)$, usually written as δ(µ − ν). This suggests that in Multi(V(µ), R), every ν ∈ V(µ) is in the point spectrum of µ, so that discontinuous functions that are pointwise limits of functions in function space can be replaced by graphically converged multifunctions in the space of multifunctions. Completing the equivalence class of 0 in Fig. , gives the multifunctional solution of Eq. (63).

From a comparison of the definition of ill-posedness (Sec. 2) and the spectrum (Table 1), it is clear that $L_\lambda(x) = y$ is ill-posed iff

(1) $L_\lambda$ not injective ⇔ $\lambda \in P\sigma(L_\lambda)$, which corresponds to the first row of Table 1.
(2) $L_\lambda$ not surjective ⇔ the values of λ correspond to the second and third columns of Table 1.
(3) $L_\lambda$ is bijective but not open ⇔ λ is either in $C\sigma(L_\lambda)$ or $R\sigma(L_\lambda)$ corresponding to the second row of Table 1.

We verify in the three steps below that for $X = L_1[-1, 1]$ of integrable functions, ν ∈ V(µ) = [−1, 1] belongs to the continuous spectrum of µ.
  • 51. Toward a Theory of Chaos 3197 Table 1. Spectrum of linear operator L ∈ Map(X). Here Lλ := L−λ satisfies the equation Lλ (x) = 0, with the resolvent set ρ(L) of L consisting of all those complex numbers λ for which L−1 exists as a continuous operator with dense λ domain. Any value of λ for which this is not true is in the spectrum σ(L) of L, that is further subdivided into three disjoint components of the point, continuous and residual spectra according to the criteria shown in the table. R(Lλ ) Lλ L−1 λ R=X Cl(R) = X Cl(R) = X Not injective ··· P σ(L) P σ(L) P σ(L) Not continuous Cσ(L) Cσ(L) Rσ(L) Injective Continuous ρ(L) ρ(L) Rσ(L) (a) R(µν ) is dense, but not equal to L1 . The set Nevertheless although the net of functions of functions g(µ) ∈ L1 such that µ−1 g ∈ L1 ν 1 cannot be the whole of L1 . Thus, for example, δνε (µ) = −1 (1 + ν)/ε + tan−1 (1 − ν)/ε tan the piecewise constant function g = const = 0 ε on |µ − ν| ≤ δ 0 and 0 otherwise is in L1 × , ε0 but not in R(µν ) as µ−1 g ∈ L1 . Nevertheless (µ − ν)2 + ε2 ν for any g ∈ L1 , we may choose the sequence of 1 is in the domain of µν because −1 δνε (µ)dµ =1 functions for all ε 0, 1 0, if |µ − ν| ≤ 1/n lim |µ − ν|δνε (µ)dµ = 0 ε→0 −1 gn (µ) = g(µ), otherwise implying that (µ − ν)−1 is unbounded. Taken together, (a) and (b) show that func- in R(µν ) to be eventually in every neighbor- tional solutions of Eq. (63) lead to state 2–2 in 1 Table 1; hence ν ∈ [−1, 1] = Cσ(µ). hood of g in the sense that limn→∞ −1 |g − gn | = 0. (c) The two integral constraints in (b) also mean (b) The inverse (µ − ν)−1 exists but is not contin- that ν ∈ Cσ(µ) is a generalized eigenvalue uous. The inverse exists because, as noted ear- of µ which justifies calling the graphical limit G lier, 0 is the only functional solution of Eq. (63). δνε (µ) → δν (µ) a generalized, or singular, eigen- 20 20 −32 32 −0.5 0.5 0.5 0 0 −0.5 −0.5 0 0 0.5 −32 −32 (a) (a) (b) (b) (a) (b) Fig. 17. 
Graphical convergence of: (a) Poisson kernel δε (x) = ε/π(x2 + ε2 ) and (b) conjugate Poisson kernel Pε (x) = x/(x2 + ε2 ) to the Dirac delta and principal value, respectively; the graphs, each for a definite ε-value, converges to the respective limits as ε → 0.
  • 52. 3198 A. Sengupta function, see Fig. 17 which clearly indicates the with convergence of the net of functions. 26 1 dµ 1 ε→0 πε = ε = 2 tan−1 −→ π . −1 µ2 + ε2 ε From the fact that the solution Eq. (A.56) of the transport equation contains an integral involv- These discretized equations should be compared ing the multifunction φ(µ, ν), we may draw an in- with the corresponding exact ones of Appendix A.4. teresting physical interpretation. As the multi ap- We shall see that the net of functions (65) con- pears everywhere on V (µ) (i.e. there are no chaotic verges graphically to the multifunction Eq. (A.55) orbits but only the multifunctions that produce as ε → 0. them), we have here a situation typical of maximal In the discretized spectral approximation, ill-posedness characteristic of chaos: note that both the singular eigenfunction φ(µ, ν) is replaced by the functions comprising φε (µ, ν) are non-injective. φε (µ, ν), ε → 0, with the integral in ν being replaced As the solution (A.56) involves an integral over all by an appropriate sum. The solution Eq. (A.58) of ν ∈ V (µ), the singular eigenfunctions — that col- the physically interesting half-space x ≥ 0 problem lectively may be regarded as representing a chaotic then reduces to [Sengupta, 1988, 1995] substate of the system represented by the solution of Φε (x, µ) = a(ν0 )e−x/ν0 φ(µ, ν0 ) the neutron transport equation — combine with the N functional components φ(±ν0 , µ) to produce the + a(νi )e−x/νi φε (µ, νi ) µ ∈ [0, 1] well-defined, non-chaotic, experimental end result i=1 of the neutron flux Φ(x, µ). The solution (A.56) is obtained by assuming (66) Φ(x, µ) = e−x/ν φ(µ, ν) to get the equation for where the nodes {νi }N are chosen suitably. This i=1 φ(µ, ν) to be (µ − ν)φ(µ, ν) = −cν/2 with the nor- discretized spectral approximation to Case’s so- 1 malization −1 φ(µ, ν) = 1. 
As µ−1 is not invert- ν lution has given surprisingly accurate numerical ible in Multi(V (µ), R) and µνB : XB → f (X) does results for a set of properly chosen nodes when not exist, the alternate approach of regularization compared with exact calculations. Because of its was adopted in [Sengupta, 1988, 1995] to rewrite involved nature [Case Zweifel, 1967], the exact µν φ(µ, ν) = −cν/2 as µνε φε (µ, ν) = −cν/2 with calculations are basically numerical which leads to µνε := µ − (ν + iε) being a net of bijective func- nonlinear integral equations as part of the solu- tions for ε 0; this is a consequence of the fact tion procedure. To appreciate the enormous com- that for the multiplication operator every non-real plexity of the exact treatment of the half-space λ belongs to the resolvent set of the operator. The problem, we recall that the complete set of eigen- family of solutions of the latter equation is given by functions {φ(µ, ν0 ), {φ(µ, ν)}ν∈[0,1] } are orthogo- [Sengupta 1988, 1995] nal with respect to the half-range weight function W (µ) of half-range theory, Eq. (A.61), that is ex- cν ν−µ λε (ν) ε pressed only in terms of solution of the nonlin- φε (ν, µ) = + 2 (µ − ν) 2 + ε2 πε (µ − ν)2 + ε2 ear integral equation Eq. (A.62). The solution of (65) a half-space problem then evaluates the coefficients {a(ν0 ), a(ν)ν∈[0, 1] } from the appropriate half range where the required normalization 1 φε (ν, µ) = 1 (that is 0 ≤ µ ≤ 1) orthogonality integrals satisfied −1 gives by the eigenfunctions {φ(µ, ν0 ), {φ(µ, ν)}ν∈[0, 1] } with respect to the weight W (µ), see Appendix A.4 πε for the necessary details of the half-space problem λε (ν) = tan−1 (1 + ν)/ε + tan−1 (1 − ν)/ε in neutron transport theory. As may be appreciated from this brief introduc- cν (1 + ν)2 + ε2 × 1− ln tion, solutions to half-space problems are not sim- 4 (1 − ν)2 + ε2 ple and actual numerical computations must rely a ε→0 −→ πλ(ν) great deal on tabulated values of the X-function. 
²⁶ The technical definition of a generalized eigenvalue is as follows. Let $L$ be a linear operator such that there exists in the domain of $L$ a sequence of elements $(x_n)$ with $\|x_n\| = 1$ for all $n$. If $\lim_{n\to\infty}\|(L-\lambda)x_n\| = 0$ for some $\lambda \in \mathbb{C}$, then this $\lambda$ is a generalized eigenvalue of $L$, the corresponding eigenfunction $x_\infty$ being a generalized eigenfunction.
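The definition in footnote 26 can be made concrete with a finite truncation. The sketch below assumes the diagonal operator with entries 1/k (an example the paper itself uses in Sec. 6) and the Euclidean norm; the helper names are hypothetical. Each unit vector $e_k$ has norm 1, while $\|(A-\lambda)e_k\| = |1/k - \lambda|$ tends to 0 for $\lambda = 0$, so 0 qualifies as a generalized eigenvalue even though no $e_k$ is an eigenvector for it.

```python
import math

def apply_diagonal(x):
    """Apply A = diag(1, 1/2, 1/3, ...) to a finitely supported sequence x."""
    return [x_k / (k + 1) for k, x_k in enumerate(x)]

def unit_vector(k, dim):
    """k-th unit vector e_k (1-based index) in a dim-dimensional truncation."""
    return [1.0 if i == k - 1 else 0.0 for i in range(dim)]

def residual_norm(k, lam, dim):
    """Euclidean norm of (A - lam) e_k for the truncated operator."""
    e_k = unit_vector(k, dim)
    image = apply_diagonal(e_k)
    return math.sqrt(sum((a - lam * e) ** 2 for a, e in zip(image, e_k)))

if __name__ == "__main__":
    dim = 1000  # truncation size, an arbitrary illustrative choice
    for k in (1, 10, 100, 1000):
        print(k, residual_norm(k, 0.0, dim))
```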
Self-consistent calculations of sample benchmark problems, performed by the discretized spectral approximation in a full-range adaption of the half-range problem described below that generates all necessary data independently of numerical tables, with the quadrature nodes $\{\nu_i\}_{i=1}^{N}$ taken at the zeros of Legendre polynomials, show that the full-range formulation of this approximation [Sengupta, 1988, 1995] can give very accurate results not only for integrated quantities like the flux $\Phi$ and the leakage of particles out of the half-space, but also for basic "raw" data like the extrapolated end point

\[ z_0 = \frac{c\nu_0}{4}\int_0^1 \frac{\nu}{N(\nu)}\left(1+\frac{c\nu^2}{1-\nu^2}\right)\ln\frac{\nu_0+\nu}{\nu_0-\nu}\,d\nu \tag{67} \]

and of the X-function itself. Given the involved nature of the exact theory, it is our contention that the remarkable accuracy of these basic data, some of which is reproduced in Table 2, is due to the graphical convergence of the net of functions

\[ \phi_\varepsilon(\mu,\nu) \xrightarrow{\;\mathbf{G}\;} \phi(\mu,\nu) \]

shown in Fig. 18; here $\varepsilon = 1/(\pi N)$ so that $\varepsilon \to 0$ as $N \to \infty$. By this convergence, the delta function and principal values in $[-1,1]$ are the multifunctions $([-1,0),0) \cup (0,[0,\infty)) \cup ((0,1],0)$ and $\{1/x\}_{x\in[-1,0)} \cup (0,(-\infty,\infty)) \cup \{1/x\}_{x\in(0,1]}$ respectively.

[Figure 18: panels for c = 0.3 and c = 0.9, with the N = 1000 curves marked.]
Fig. 18. Rational function approximations $\phi_\varepsilon(\mu,\nu)$ of the singular eigenfunction $\phi(\mu,\nu)$ at four different values of $\nu$. N = 1000 denotes the "converged" multifunction $\phi$, with the peaks at the specific $\nu$-values chosen.

Tables 2 and 3, taken from [Sengupta, 1988] and [Sengupta, 1995], show respectively the extrapolated end point and X-function by the full-range adaption of the discretized spectral approximation for two different half-range problems denoted as Problems A and B defined as
Problem A
  Equation: $\mu\Phi_x + \Phi = (c/2)\int_{-1}^{1}\Phi(x,\mu')\,d\mu'$, $x \ge 0$
  Boundary condition: $\Phi(0,\mu) = 0$ for $\mu \ge 0$
  Asymptotic condition: $\Phi \to e^{-x/\nu_0}\phi(\mu,\nu_0)$ as $x \to \infty$.

Problem B
  Equation: $\mu\Phi_x + \Phi = (c/2)\int_{-1}^{1}\Phi(x,\mu')\,d\mu'$, $x \ge 0$
  Boundary condition: $\Phi(0,\mu) = 1$ for $\mu \ge 0$
  Asymptotic condition: $\Phi \to 0$ as $x \to \infty$.

The full $-1 \le \mu \le 1$ range form of the half $0 \le \mu \le 1$ range discretized spectral approximation replaces the exact integral boundary condition at $x = 0$ by a suitable quadrature sum over the values of $\nu$ taken at the zeros of Legendre polynomials; thus the condition at $x = 0$ can be expressed as

\[ \psi(\mu) = a(\nu_0)\phi(\mu,\nu_0) + \sum_{i=1}^{N} a(\nu_i)\phi_\varepsilon(\mu,\nu_i), \qquad \mu\in[0,1], \tag{68} \]

where $\psi(\mu) = \Phi(0,\mu)$ is the specified incoming radiation incident on the boundary from the left, and the half-range coefficients $a(\nu_0)$, $\{a(\nu)\}_{\nu\in[0,1]}$ are to be evaluated using the W-function of Appendix A.4. We now exploit the relative simplicity of the full-range calculations by replacing Eq. (68) by Eq. (69) following, where the coefficients $\{b(\nu_i)\}_{i=0}^{N}$ are used to distinguish the full-range coefficients from the half-range ones. The significance of this change lies in the overwhelming simplicity of the full-range weight function $\mu$ as compared to the half-range function $W(\mu)$, and the resulting simplicity of the orthogonality relations that follow, see Appendix A.4. The basic data of $z_0$ and $X(-\nu)$ are then completely generated self-consistently [Sengupta, 1988, 1995] by the discretized spectral approximation from the full-range adaption

\[ \sum_{i=0}^{N} b_i\,\phi_\varepsilon(\mu,\nu_i) = \psi_+(\mu) + \psi_-(\mu), \qquad \mu\in[-1,1],\ \nu_i \ge 0 \tag{69} \]

of the discretized boundary condition Eq. (68).

Table 2. Extrapolated end-point z0.

                       cz0
  c      N = 2      N = 6      N = 10     Exact
  0.2    0.78478    0.78478    0.78478    0.7851
  0.4    0.72996    0.72996    0.72996    0.7305
  0.6    0.71535    0.71536    0.71536    0.7155
  0.8    0.71124    0.71124    0.71124    0.7113
  0.9    0.71060    0.71060    0.71061    0.7106

Table 3. X(−ν) by the full-range method.
                              X(−ν)
  c      N     νi        Problem A    Problem B    Exact
  0.2    2     0.2133    0.8873091    0.8873091    0.887308
               0.7887    0.5826001    0.5826001    0.582500
  0.6    6     0.0338    1.3370163    1.3370163    1.337015
               0.1694    1.0999831    1.0999831    1.099983
               0.3807    0.8792321    0.8792321    0.879232
               0.6193    0.7215240    0.7215240    0.721524
               0.8306    0.6239109    0.6239109    0.623911
               0.9662    0.5743556    0.5743556    0.574355
  0.9    10    0.0130    1.5971784    1.5971784    1.597163
               0.0674    1.4245314    1.4245314    1.424532
               0.1603    1.2289940    1.2289940    1.228995
               0.2833    1.0513750    1.0513750    1.051376
               0.4255    0.9058140    0.9058410    0.905842
               0.5744    0.7934295    0.7934295    0.793430
               0.7167    0.7102823    0.7102823    0.710283
               0.8397    0.6516836    0.6516836    0.651683
               0.9325    0.6136514    0.6136514    0.613653
               0.9870    0.5933988    0.5933988    0.593399
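The node column of Table 3 can be regenerated from the statement that the quadrature nodes are taken at the zeros of Legendre polynomials, with $\varepsilon = 1/(\pi N)$. The sketch below rests on one assumption that the text does not state explicitly: that the zeros of $P_N$ on $(-1,1)$ are mapped affinely onto $(0,1)$, which is what the tabulated values suggest. The function names and the Newton iteration are illustrative choices, not the author's code.

```python
import math

def legendre_zeros(n, tol=1e-14, max_iter=100):
    """Zeros of the Legendre polynomial P_n on (-1, 1), by Newton iteration."""
    zeros = []
    for i in range(1, n + 1):
        # Standard asymptotic starting guess for the i-th zero.
        x = math.cos(math.pi * (i - 0.25) / (n + 0.5))
        for _ in range(max_iter):
            p_prev, p = 1.0, x  # P_0(x) and P_1(x)
            for k in range(2, n + 1):  # three-term recurrence up to P_n
                p_prev, p = p, ((2 * k - 1) * x * p - (k - 1) * p_prev) / k
            dp = n * (x * p - p_prev) / (x * x - 1.0)  # derivative P_n'(x)
            step = p / dp
            x -= step
            if abs(step) < tol:
                break
        zeros.append(x)
    return sorted(zeros)

def half_range_nodes(n):
    """Assumed mapping of the zeros of P_n from (-1, 1) onto (0, 1)."""
    return [0.5 * (1.0 + z) for z in legendre_zeros(n)]

def regularization_width(n):
    """epsilon = 1 / (pi N), as stated in the text."""
    return 1.0 / (math.pi * n)

if __name__ == "__main__":
    for n in (2, 6, 10):
        nodes = ", ".join(f"{v:.4f}" for v in half_range_nodes(n))
        print(f"N={n}: eps={regularization_width(n):.5f}, nodes: {nodes}")
```

For N = 6 this agrees with the νi column of Table 3 to the four digits shown. For N = 2 it gives 0.2113 and 0.7887, so the first node differs from the tabulated 0.2133 in the third decimal place.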
In Eq. (69), $\psi_+(\mu)$ is by definition the incident flux $\psi(\mu)$ for $\mu\in[0,1]$ and 0 if $\mu\in[-1,0]$, while

\[ \psi_-(\mu) = \begin{cases} \displaystyle\sum_{i=0}^{N} b_i^{-}\,\phi_\varepsilon(\mu,\nu_i) & \text{if } \mu\in[-1,0],\ \nu_i \ge 0 \\ 0 & \text{if } \mu\in[0,1] \end{cases} \]

is the emergent angular distribution out of the medium. Equation (69) corresponds to the full-range $\mu\in[-1,1]$, $\nu_i \ge 0$ form

\[ b(\nu_0)\phi(\mu,\nu_0) + \int_0^1 b(\nu)\phi(\mu,\nu)\,d\nu = \psi_+(\mu) + b^{-}(\nu_0)\phi(\mu,\nu_0) + \int_0^1 b^{-}(\nu)\phi(\mu,\nu)\,d\nu \tag{70} \]

of boundary condition (A.59), with the first and second terms on the right having the same interpretation as for Eq. (69). This full-range simulation merely states that the solution (A.58) of Eq. (A.53) holds for all $\mu\in[-1,1]$, $x \ge 0$, although it was obtained, unlike in the regular full-range case, from the given radiation $\psi(\mu)$ incident on the boundary at $x = 0$ over only half the interval $\mu\in[0,1]$. To obtain the simulated full-range coefficients $\{b_i\}$ and $\{b_i^{-}\}$ of the half-range problem, we observe that there are effectively only half the number of coefficients as compared to a normal full-range problem because $\nu$ is now only over half the full interval. This allows us to generate two sets of equations from (70) by integrating with respect to $\mu\in[-1,1]$ with $\nu$ in the half intervals $[-1,0]$ and $[0,1]$ to obtain the two sets of coefficients $b^{-}$ and $b$, respectively. Accordingly we get from Eq. (69) with $j = 0, 1, \ldots, N$ the sets of equations

\[ \begin{aligned} (\psi,\phi_{j-})_\mu^{(+)} &= -\sum_{i=0}^{N} b_i^{-}\,(\phi_{i+},\phi_{j-})_\mu^{(-)} \\ b_j &= \frac{1}{N_j}\left[(\psi,\phi_{j+})_\mu^{(+)} + \sum_{i=0}^{N} b_i^{-}\,(\phi_{i+},\phi_{j+})_\mu^{(-)}\right] \end{aligned} \tag{71} \]

where $(\phi_{j\pm})_{j=1}^{N}$ represents $(\phi_\varepsilon(\mu,\pm\nu_j))_{j=1}^{N}$, $\phi_{0\pm} = \phi(\mu,\pm\nu_0)$, the $(+)$ and $(-)$ superscripts are used to denote the integrations with respect to $\mu\in[0,1]$ and $\mu\in[-1,0]$ respectively, and $(f,g)_\mu$ denotes the usual inner product in $[-1,1]$ with respect to the full-range weight $\mu$. While the first set of $N+1$ equations gives $b_i^{-}$, the second set produces the required $b_j$ from these "negative" coefficients. By equating these calculated $b_i$ with the exact half-range expressions for $a(\nu)$ with respect to $W(\mu)$ as outlined in Appendix A.4, it is possible to find numerical values of $z_0$ and $X(-\nu)$. Thus from the second of Eq. (A.64), $\{X(-\nu_i)\}_{i=1}^{N}$ is obtained with $b_{iB} = a_{iB}$, $i = 1, \ldots, N$, which is then substituted in the second of Eq. (A.63) with $X(-\nu_0)$ obtained from $a_A(\nu_0)$ according to Appendix A.4, to compare the respective $a_{iA}$ with the calculated $b_{iA}$ from (71). Finally the full-range coefficients of Problem A can be used to obtain the $X(-\nu)$ values from the second of Eqs. (A.63) and compared with the exact tabulated values as in Table 3. The tabulated values of $cz_0$ from Eq. (67) show a consistent deviation from our calculations of Problem A according to $a_A(\nu_0) = -\exp(-2z_0/\nu_0)$. Since the $X(-\nu)$ values of Problem A in Table 3 also need the same $b_{0A}$ as input that was used in obtaining $z_0$, it is reasonable to conclude that the "exact" numerical integration of $z_0$ is inaccurate to the extent displayed in Table 2.

From these numerical experiments and Fig. 18 we may conclude that the continuous spectrum $[-1,1]$ of the position operator $\mu$ acts as the $D_+$ points in generating the multifunctional Case singular eigenfunction $\phi(\mu,\nu)$. Its rational approximation $\phi_\varepsilon(\mu,\nu)$, in the context of the simple simulated full-range computations of the complex half-range exact theory of Appendix A.4, clearly demonstrates the utility of graphical convergence of a sequence of functions to a multifunction. The totality of the multifunctions $\phi(\mu,\nu)$ for all $\nu$ in Figs. 18(c) and 18(d) endows the problem with the character of maximal ill-posedness that is characteristic of chaos. This chaotic signature of the transport equation is however latent, as the experimental output $\Phi(x,\mu)$ is well-behaved and regular. This important example shows how nature can use hidden and complex chaotic substates to generate order through a process of superposition.

6. Does Nature Support Complexity?

The question of this section is basic in the light of the theory of chaos presented above, as it may be reformulated to the inquiry of what makes nature support chaoticity in the form of increasing non-injectivity of an input–output system. It is the purpose of this section to exploit the connection between spectral theory and the dynamics of chaos that has been presented in the previous section.
Since linear operators on finite dimensional spaces do not possess continuous or residual spectra, spectral theory on infinite dimensional spaces essentially involves limiting behavior to infinite dimensions of the familiar matrix eigenvalue–eigenvector problem. As always this means extensions, dense embeddings and completions of the finite dimensional problem that show up as generalized eigenvalues and eigenvectors. In its usual form, the goal of nonlinear spectral theory consists [Appell et al., 2000] in the study of $T_\lambda^{-1}$ for nonlinear operators $T_\lambda$ that satisfy more general continuity conditions, like differentiability and Lipschitz continuity, than simple boundedness that is sufficient for linear operators. The following generalization of the concept of the spectrum of a linear operator to the nonlinear case is suggestive. For a nonlinear map, $\lambda$ need not appear only in a multiplying role, so that an eigenvalue equation can be written more generally as a fixed-point equation

\[ f(\lambda; x) = x \]

with a fixed point corresponding to the eigenfunction of a linear operator and an "eigenvalue" being the value of $\lambda$ for which this fixed point appears. The correspondence of the residual and continuous parts of the spectrum is, however, less trivial than for the point spectrum. This is seen from the following two examples [Roman, 1975]. Let $Ae_k = \lambda_k e_k$, $k = 1, 2, \ldots$ be an eigenvalue equation with $e_j$ being the $j$th unit vector. Then $(A-\lambda)e_k := (\lambda_k-\lambda)e_k = 0$ iff $\lambda = \lambda_k$ so that $\{\lambda_k\}_{k=1}^{\infty} \in P\sigma(A)$ are the only eigenvalues of $A$. Consider now $(\lambda_k)_{k=1}^{\infty}$ to be a sequence of real numbers that tends to a finite $\lambda^*$; for example, let $A$ be a diagonal matrix having $1/k$ as its diagonal entries. Then $\lambda^*$ belongs to the continuous spectrum of $A$ because $(A-\lambda^*)e_k = (\lambda_k-\lambda^*)e_k$ with $\lambda_k \to \lambda^*$ implies that $(A-\lambda^*)^{-1}$ is an unbounded linear operator and $\lambda^*$ a generalized eigenvalue of $A$. In the second example $Ae_k = e_{k+1}/(k+1)$, it is not difficult to verify that: (a) the point spectrum of $A$ is empty, (b) the range of $A$ is not dense because it does not contain $e_1$, and (c) $A^{-1}$ is unbounded because $Ae_k \to 0$. Thus the generalized eigenvalue $\lambda^* = 0$ in this case belongs to the residual spectrum of $A$. In either case, $\lim_{j\to\infty} e_j$ is the corresponding generalized eigenvector that enlarges the trivial null space $N(L_{\lambda^*})$ of the generalized eigenvalue $\lambda^*$. In fact in these two and the Dirac delta example of Sec. 5 of continuous and residual spectra, the generalized eigenfunctions arise as the limits of a sequence of functions whose images under the respective $L_\lambda$ converge to 0; recall the definition of footnote 26. This observation generalizes to the dense extension $\mathrm{Multi}_{|}(X,Y)$ of $\mathrm{Map}(X,Y)$ as follows. If $x \in D_+$ is not a fixed point of $f(\lambda;x) = x$, but there is some $n \in \mathbb{N}$ such that $f^n(\lambda;x) = x$, then the limit $n \to \infty$ generates a multifunction at $x$, as was the case with the delta function in the previous section and the various other examples that we have seen so far in the earlier sections.

One of the main goals of investigations on the spectrum of nonlinear operators is to find a set in the complex plane that has the usual desirable properties of the spectrum of a linear operator [Appell et al., 2000]. In this case, the focus has been to find a suitable class of operators $C(X)$ with $T \in C(X)$, such that the resolvent set is expressed as

\[ \rho(T) = \{\lambda\in\mathbb{C} : (T_\lambda \text{ is } 1:1)\,(\mathrm{Cl}(R(T_\lambda)) = X) \text{ and } (T_\lambda^{-1}\in C(X) \text{ on } R(T_\lambda))\} \]

with the spectrum $\sigma(T)$ being defined as the complement of this set. Among the classes $C(X)$ that have been considered, beside spaces of continuous functions $C(X)$, are linear boundedness $B(X)$, Fréchet differentiability $C^1(X)$, Lipschitz continuity $\mathrm{Lip}(X)$, and Granas quasiboundedness $Q(X)$, where $\mathrm{Lip}(X)$ specifically takes into account the nonlinearity of $T$ to define

\[ \|T\|_{\mathrm{Lip}} = \sup_{x\ne y}\frac{\|T(x)-T(y)\|}{\|x-y\|}, \qquad |T|_{\mathrm{lip}} = \inf_{x\ne y}\frac{\|T(x)-T(y)\|}{\|x-y\|} \tag{72} \]

that are plain generalizations of the corresponding norms of linear operators. Plots of $f_\lambda^{-}(y) = \{x \in D(f-\lambda) : (f-\lambda)x = y\}$ for the functions $f : \mathbb{R} \to \mathbb{R}$

\[ f_{\lambda a}(x) = \begin{cases} -1-\lambda x, & x < -1 \\ (1-\lambda)x, & -1 \le x \le 1 \\ 1-\lambda x, & 1 < x \end{cases} \]

\[ f_{\lambda b}(x) = \begin{cases} -\lambda x, & x < 1 \\ (1-\lambda)x - 1, & 1 \le x \le 2 \\ 1-\lambda x, & 2 < x \end{cases} \]

\[ f_{\lambda c}(x) = \begin{cases} -\lambda x, & x < 1 \\ \sqrt{x-1}-\lambda x, & 1 \le x \end{cases} \]

\[ f_{\lambda d}(x) = \begin{cases} (x-1)^2 + 1 - \lambda x, & 1 \le x \le 2 \\ (1-\lambda)x, & \text{otherwise} \end{cases} \]

\[ f_{\lambda e}(x) = \tan^{-1}(x) - \lambda x \]
\[ f_{\lambda f}(x) = \begin{cases} 1 - 2\sqrt{-x} - \lambda x, & x < -1 \\ (1-\lambda)x, & -1 \le x \le 1 \\ 2\sqrt{x} - 1 - \lambda x, & 1 < x \end{cases} \]

taken from [Appell et al., 2000] are shown in Fig. 19. It is easy to verify that the Lipschitz and linear upper and lower bounds of these maps are as in Table 4.

[Figure 19: six panels, (a) to (f).]
Fig. 19. Inverses of $f_\lambda = f - \lambda$. The $\lambda$-values are shown on the graphs.

The point spectrum defined by

\[ P\sigma(f) = \{\lambda\in\mathbb{C} : (f-\lambda)x = 0 \text{ for some } x \ne 0\} \]

is the simplest to calculate. Because of the special role played by the zero element 0 in generating the point spectrum in the linear case, the bounds $m\|x\| \le \|Lx\| \le M\|x\|$ together with $Lx = \lambda x$ imply $\mathrm{Cl}(P\sigma(L)) = [\|L\|_b, \|L\|_B]$ (where the subscripts denote the lower and upper bounds in Eq. (72), and which sometimes is taken to be a descriptor of the point spectrum of a nonlinear operator), as can be seen in Table 5 and verified from Fig. 19. The remainder of the spectrum, as the complement of the resolvent set, is more difficult to find. Here the convenient characterization of the resolvent of a continuous linear operator as the set of all sufficiently large $\lambda$ that satisfy $|\lambda| > M$ is of little significance as, unlike for a linear operator, the non-existence of an inverse is not just due to the set $\{f^{-1}(0)\}$, which happens to be the only way a linear map can fail to be injective. Thus the map defined piecewise as $\alpha + 2(1-\alpha)x$ for $0 \le x < 1/2$ and $2(1-x)$ for $1/2 \le x \le 1$, with $0 < \alpha < 1$, is not invertible on its range although $\{f^{-}(0)\} = 1$. Comparing Fig. 19 and Table 4, it is seen that in cases (b)–(d), the intervals $[|f|_b, \|f\|_B]$ are subsets of the $\lambda$-values for which the respective maps are not injective; this is to be compared with (a), (e) and (f) where the two sets are the same. Thus the linear bounds are not good indicators of the uniqueness properties of solutions of nonlinear equations, for which the Lipschitzian bounds are seen to be more appropriate.
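The comparison between linear and Lipschitzian bounds can be reproduced on a grid for one of the maps. The sketch below takes case (b) and estimates the four bounds of Table 4 together with the injectivity of $f_b - \lambda$. It is an illustration under assumptions, not the procedure of [Appell et al., 2000]: the grid on $[-5, 5]$, the use of neighbouring grid points only for the Lipschitz estimate, and the strict-monotonicity test for injectivity are all choices made here, and the helper names are hypothetical.

```python
def f_b(x):
    """Map f_b of the text (the lambda = 0 member of the family f_lambda_b)."""
    if x < 1.0:
        return 0.0
    if x <= 2.0:
        return x - 1.0
    return 1.0

def grid(lo=-5, hi=5, per_unit=100):
    """Uniform grid built from integers so the breakpoints 1 and 2 are hit exactly."""
    return [i / per_unit for i in range(lo * per_unit, hi * per_unit + 1)]

def linear_bounds(f, xs):
    """Grid estimates of the inf and sup of |f(x)| / |x| over x != 0."""
    ratios = [abs(f(x)) / abs(x) for x in xs if x != 0.0]
    return min(ratios), max(ratios)

def lipschitz_bounds(f, xs):
    """Grid estimates of the inf and sup of |f(x) - f(y)| / |x - y| (neighbours only)."""
    slopes = [abs(f(b) - f(a)) / (b - a) for a, b in zip(xs, xs[1:])]
    return min(slopes), max(slopes)

def is_injective(f, lam, xs):
    """A continuous real map is injective iff strictly monotone; tested on the grid."""
    values = [f(x) - lam * x for x in xs]
    steps = [v2 - v1 for v1, v2 in zip(values, values[1:])]
    return all(s > 0 for s in steps) or all(s < 0 for s in steps)

if __name__ == "__main__":
    xs = grid()
    print("linear bounds   :", linear_bounds(f_b, xs))
    print("Lipschitz bounds:", lipschitz_bounds(f_b, xs))
    for lam in (-0.5, 0.25, 0.75, 1.5):
        print(f"lambda={lam}: injective on grid = {is_injective(f_b, lam, xs)}")
```

With these choices the linear bounds come out as 0 and 1/2 and the Lipschitz bounds as 0 and (up to rounding) 1, while $f_b - \lambda$ already fails to be injective at $\lambda = 0.75$, which lies outside $[0, 1/2]$ but inside $[0, 1]$.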
Table 4. Bounds on the functions of Fig. 19.

  Function    |f|_b         ‖f‖_B    |f|_lip    ‖f‖_Lip
  f_a         0             1        0          1
  f_b         0             1/2      0          1
  f_c         0             1/2      0          ∞
  f_d         2(√2 − 1)     ∞        0          2
  f_e         0             1        0          1
  f_f         0             1        0          1

Table 5. Lipschitzian and point spectra of the functions of Fig. 19.

  Functions   σ_Lip(f)    Pσ(f)
  f_a         [0, 1]      (0, 1]
  f_b         [0, 1]      [0, 1/2]
  f_c         [0, ∞)      [0, 1/2]
  f_d         [0, 2]      [2(√2 − 1), 1]
  f_e         [0, 1]      (0, 1)
  f_f         [0, 1]      (0, 1)

In view of the above, we may draw the following conclusions. If we choose to work in the space of multifunctions $\mathrm{Multi}(X,\mathcal{T})$, with $\mathcal{T}$ the topology of pointwise biconvergence, when all functional relations are (multi)invertible on their ranges, we may make the following definition for the net of functions $f(\lambda;x)$ satisfying $f(\lambda;x) = x$.

Definition 6.1. Let $f(\lambda;\cdot) \in \mathrm{Multi}(X,\mathcal{T})$ be a function. The resolvent set of $f$ is given by

\[ \rho(f) = \{\lambda : (f(\lambda;\cdot)^{-1} \in \mathrm{Map}(X,\mathcal{T})) \wedge (\mathrm{Cl}(R(f(\lambda;\cdot))) = X)\}, \]

and any $\lambda$ not in $\rho$ is in the spectrum of $f$.

Thus apart from multifunctions, $\lambda \in \sigma(f)$ also generates functions on the boundary of functional and non-functional relations in $\mathrm{Multi}(X,\mathcal{T})$. While it is possible to classify the spectrum into point, continuous and residual subsets, as in the linear case, it is more meaningful for nonlinear operators to consider $\lambda$ as being either in the boundary spectrum $\mathrm{Bdy}(\sigma(f))$ or in the interior spectrum $\mathrm{Int}(\sigma(f))$, depending on whether or not the multifunction $f(\lambda;\cdot)^{-}$ arises as the graphical limit of a net of functions in either $\rho(f)$ or $R\sigma(f)$. This is suggested by the spectra arising from the second row of Table 1 (injective $L_\lambda$ and discontinuous $L_\lambda^{-1}$) that lies sandwiched in the $\lambda$-plane between the two components arising from the first and third rows, see [Naylor & Sell, 1971, Sec. 6.6], for example. According to this simple scheme, the spectral set is a closed set with its boundary and interior belonging to $\mathrm{Bdy}(\sigma(f))$ and $\mathrm{Int}(\sigma(f))$, respectively. Table 6 shows this division for the examples in Fig. 19. Because 0 is no more significant than any other point in the domain of a nonlinear map in inducing non-injectivity, the division of the spectrum into the traditional sets would be as shown in Table 6; compare also with the conventional linear point spectrum of Table 5. In this nonlinear classification, the point spectrum consists of any $\lambda$ for which the inverse $f(\lambda;\cdot)^{-}$ is set-valued, irrespective of whether this is produced at 0 or not, while the continuous and residual spectra together comprise the boundary spectrum. Thus a $\lambda$ can be both at the point and the continuous or residual spectra, which need not be disjoint. The continuous and residual spectra are included in the boundary spectrum, which may also contain parts of the point spectrum.

Table 6. Nonlinear spectra of functions of Fig. 19. Compare the present point spectra with the usual linear spectra of Table 5.

  Function    Int(σ(f))    Bdy(σ(f))    Pσ(f)     Cσ(f)     Rσ(f)
  f_a         (0, 1)       {0, 1}       [0, 1]    {1}       {0}
  f_b         (0, 1)       {0, 1}       [0, 1]    {1}       {0}
  f_c         (0, ∞)       {0}          [0, ∞)    {0}       ∅
  f_d         (0, 2)       {0, 2}       (0, 2)    {0, 2}    ∅
  f_e         (0, 1)       {0, 1}       (0, 1)    {1}       {0}
  f_f         (0, 1)       {0, 1}       (0, 1)    {0, 1}    ∅

Example 6.1. To see how these concepts apply to linear mappings, consider the equation $(D-\lambda)y(x) = r(x)$ where $D = d/dx$ is the differential
operator on $L_2[0,\infty)$, and let $\lambda$ be real. For $\lambda \ne 0$, the unique solution of this equation in $L_2[0,\infty)$ is

\[ y(x) = \begin{cases} e^{\lambda x}\left(y(0) + \displaystyle\int_0^x e^{-\lambda x'}r(x')\,dx'\right), & \lambda < 0 \\[2ex] e^{\lambda x}\left(y(0) - \displaystyle\int_x^\infty e^{-\lambda x'}r(x')\,dx'\right), & \lambda > 0 \end{cases} \]

showing that for $\lambda > 0$ the inverse is functional so that $\lambda \in (0,\infty)$ belongs to the resolvent of $D$. However, when $\lambda < 0$, apart from the $y = 0$ solution (since we are dealing with a linear problem, only $r = 0$ is to be considered), $e^{\lambda x}$ is also in $L_2[0,\infty)$ so that all such $\lambda$ are in the point spectrum of $D$. For $\lambda = 0$ and $r \ne 0$, the two solutions are not necessarily equal unless $\int_0^\infty r(x) = 0$, so that the range $R(D-I)$ is a subspace of $L_2[0,\infty)$. To complete the problem, it is possible to show [Naylor & Sell, 1971] that $0 \in C\sigma(D)$, see Example 2.2; hence the continuous spectrum forms at the boundary of the functional solution for the resolvent-$\lambda$ and the multifunctional solution for the point spectrum. With a slight variation of the problem to $y(0) = 0$, all $\lambda < 0$ are in the resolvent set, while for $\lambda > 0$ the inverse is bounded but must satisfy $y(0) = \int_0^\infty e^{-\lambda x}r(x)\,dx = 0$ so that $\mathrm{Cl}(R(D-\lambda)) \ne L_2[0,\infty)$. Hence $\lambda > 0$ belong to the residual spectrum. The decomposition of the complex $\lambda$-plane for these and some other linear spectral problems taken from [Naylor & Sell, 1971] is shown in Fig. 20. In all cases, the spectrum due to the second row of Table 1 acts as a boundary between that arising from the first and third rows, which justifies our division of the spectrum for a nonlinear operator into the interior and boundary components. Compare with Example 2.2.

From the basic representation of the resolvent operator $(1-f)^{-1}$

\[ 1 + f + f^2 + \cdots + f^i + \cdots \]

in $\mathrm{Multi}(X)$, if the iterates of $f$ converge to a multifunction for some $\lambda$, then that $\lambda$ must be in the spectrum of $f$, which means that the control parameter of a chaotic dynamical system is in its spectrum. Of course, the series can sum to a multi even otherwise: take $f_\lambda(x)$ to be identically $x$ with $\lambda = 1$, for example, to get $1 \in P\sigma(f)$. A comparison of Tables 1 and 5 reveals that in case (d), for example, 0 and 2 belong to the Lipschitz spectrum because although $f_d^{-1}$ is not Lipschitz continuous, $\|f\|_{\mathrm{Lip}} = 2$. It should also be noted that the boundary between the functional resolvent and multifunctional spectral set is formed by the graphical convergence of a net of resolvent functions, while the multifunctions in the interior of the spectral set evolve graphically independent of the functions in the resolvent. The chaotic states forming the boundary of the functional and multifunctional subsets of $\mathrm{Multi}(X)$ mark the transition from the less efficient functional state to the more efficient multifunctional one.

These arguments also suggest the following. The countably many outputs arising from the non-injectivity of $f(\lambda;\cdot)$ corresponding to a given input can be interpreted to define complexity, because in a nonlinear system each of these possibilities constitutes an experimental result in itself that may not be combined in any definite predetermined manner. This is in sharp contrast to linear systems where a linear combination, governed by the initial conditions, always generates a unique end result; recall also the combination offered by the singular generalized eigenfunctions of neutron transport theory. This multiplicity of possibilities that have no definite combinatorial property is the basis of the diversity of nature, and is possibly responsible for Feigenbaum's "historical prejudice" [Feigenbaum, 1992], see Prelude 2. Thus order represented by the functional resolvent passes over to complexity of the countably multifunctional interior spectrum via the uncountably multifunctional boundary that is a prerequisite for chaos. We may now strengthen our hypothesis offered at the end of the previous section in terms of the examples of Figs. 19 and 20, that nature uses chaoticity as an intermediate step to the attainment of states that would otherwise be inaccessible to it. Well-posedness of a system is an extremely inefficient way of expressing a multitude of possibilities as this requires a different input for every possible output. Nature chooses to express its myriad manifestations through the multifunctional route leading either to averaging as in the delta function case or to a countable set of well-defined states, as in the examples of Fig. 19 corresponding to the interior spectrum. Of course it is no distraction that the multifunctional states arise respectively from $f_\lambda$ and $f_\lambda^{-}$ in these examples, as $f$ is a function on $X$ that is under the influence of both $f$ and its inverse. The functional resolvent is, for all practical purposes, only a tool in this structure of nature.

The equation $f(x) = y$ is typically an input–output system in which the inverse images at a functional value $y_0$ represent a set of input parameters leading to the same experimental output $y_0$; this
is stability characterized by a complete insensitivity of the output to changes in input. On the other hand, a continuous multifunction at $x_0$ is a signal for a hypersensitivity to input because the output, which is a definite experimental quantity, is a choice from the possibly infinite set $\{f(x_0)\}$ made by a choice function which represents the experiment at that particular point in time. Since there will always be finite differences in the experimental parameters when an experiment is repeated, the choice function (that is the experimental output) will select a point from $\{f(x_0)\}$ that is representative of that experiment and which need not bear any definite relation to the previous values; this is instability and signals sensitivity to initial conditions. Such a state is of high entropy as the number of available states $f_C(\{f(x_0)\})$ (where $f_C$ is the choice function) is larger than a functional state represented by the singleton $\{f(x_0)\}$.

[Figure 20: six panels, (a) to (f), each showing a λ-plane divided into resolvent set and point, continuous and residual spectrum.]
Fig. 20. Spectra of some linear operators in the complex $\lambda$-plane. (a) Left shift $(\ldots, x_{-1}, x_0, x_1, \ldots) \to (\ldots, x_0, x_1, x_2, \ldots)$ on $l_2(-\infty,\infty)$, (b) right shift $(x_0, x_1, x_2, \ldots) \to (0, x_0, x_1, \ldots)$ on $l_2[0,\infty)$, (c) left shift $(x_0, x_1, x_2, \ldots) \to (x_1, x_2, x_3, \ldots)$ on $l_2[0,\infty)$ of sequence spaces, and (d) $d/dx$ on $L_2(-\infty,\infty)$, (e) $d/dx$ on $L_2[0,\infty)$ with $y(0) = 0$ and (f) $d/dx$ on $L_2[0,\infty)$. The residual spectrum in (b) and (e) arises from block (3–3) in Table 1, i.e. $L_\lambda$ is one-to-one and $L_\lambda^{-1}$ is bounded on non-dense domains in $l_2[0,\infty)$ and $L_2[0,\infty)$, respectively. The continuous spectrum therefore marks the boundary between two functional states, as in (a) and (e), now with dense and non-dense domains of the inverse operator.

Epilogue

The most passionate advocates of the new science go so far as to say that twentieth-century science will be remembered for just three things: relativity, quantum mechanics and chaos. Chaos, they contend, has become the century's third great revolution in the physical sciences. Like the first two revolutions, chaos cuts away at the tenets of Newton's physics. As one physicist put it: "Relativity eliminated the Newtonian illusion of absolute space and time; quantum theory eliminated the Newtonian dream of a controllable measurement process; and chaos eliminates the Laplacian fantasy of deterministic predictability." Of the three, the revolution in chaos applies to the
universe we see and touch, to objects at human scale. . . . There has long been a feeling, not always expressed openly, that theoretical physics has strayed far from human intuition about the world. Whether this will prove to be fruitful heresy, or just plain heresy, no one knows. But some of those who thought that physics might be working its way into a corner now look to chaos as a way out.
[Gleick, 1987]

Acknowledgments

It is a pleasure to thank the referees for recommending an enlarged Tutorial and Review revision of the original submission Graphical Convergence, Chaos and Complexity, and Professor Leon O. Chua for suggesting a pedagogically self-contained, jargonless version accessible to a wider audience for the present form of the paper. Financial assistance during the initial stages of this work from the National Board for Higher Mathematics is also acknowledged.

References

Alligood, K. T., Sauer, T. D. & Yorke, J. A. [1997] Chaos, An Introduction to Dynamical Systems (Springer-Verlag, NY).
Appell, J., DePascale, E. & Vignoli, A. [2000] "A comparison of different spectra for nonlinear operators," Nonlin. Anal. 40, 73–90.
Brown, R. & Chua, L. O. [1996] "Clarifying chaos: Examples and counterexamples," Int. J. Bifurcation and Chaos 6, 219–249.
Campbell, S. L. & Meyer, C. D. [1979] Generalized Inverses of Linear Transformations (Pitman Publishing Ltd., London).
Case, K. M. & Zweifel, P. F. [1967] Linear Transport Theory (Addison-Wesley, MA).
de Souza, H. G. [1997] "Opening address," in The Impact of Chaos on Science and Society, eds. Grebogi, C. & Yorke, J. A. (United Nations University Press, Tokyo), pp. 384–386.
Devaney, R. L. [1989] An Introduction to Chaotic Dynamical Systems (Addison-Wesley, CA).
Falconer, K. [1990] Fractal Geometry (John Wiley, Chichester).
Feigenbaum, M. [1992] "Foreword," Chaos and Fractals: New Frontiers of Science (Springer-Verlag, NY), pp. 1–7.
Gallagher, R. & Appenzeller, T. [1999] "Beyond reductionism," Science 284, p. 79.
Gleick, J. [1987] Chaos: The Amazing Science of the Unpredictable (Viking, NY).
Goldenfeld, N. & Kadanoff, L. P. [1999] "Simple lessons from complexity," Science 284, 87–89.
Korevaar, J. [1968] Mathematical Methods, Vol. 1 (Academic Press, NY).
Naylor, A. W. & Sell, G. R. [1971] Linear Operator Theory in Engineering and Science (Holt, Rinehart and Winston, NY).
Peitgen, H.-O., Jurgens, H. & Saupe, D. [1992] Chaos and Fractals: New Frontiers of Science (Springer-Verlag, NY).
Robinson, C. [1999] Dynamical Systems: Stability, Symbolic Dynamics and Chaos (CRC Press LLC, Boca Raton).
Roman, P. [1975] Some Modern Mathematics for Physicists and other Outsiders (Pergamon Press, NY).
Sengupta, A. [1995a] "A discretized spectral approximation in neutron transport theory. Some numerical considerations," J. Stat. Phys. 51, 657–676.
Sengupta, A. [1995b] "Full range solution of half-space neutron transport problem," ZAMP 46, 40–60.
Sengupta, A. [1997] "Multifunction and generalized inverse," J. Inverse and Ill-Posed Problems 5, 265–285.
Sengupta, A. & Ray, G. G. [2000] "A multifunctional extension of function spaces: Chaotic systems are maximally ill-posed," J. Inverse and Ill-Posed Problems 8, 232–353.
Stuart, A. M. & Humphries, A. R. [1996] Dynamical Systems and Numerical Analysis (Cambridge University Press).
Tikhonov, A. N. & Arsenin, V. Y. [1977] Solutions of Ill-Posed Problems (V. H. Winston, Washington D.C.).
Waldrop, M. M. [1992] Complexity: The Emerging Science at the Edge of Order and Chaos (Simon and Schuster).
Willard, S. [1970] General Topology (Addison-Wesley, Reading, MA).
Williams, M. M. R. [1971] Mathematical Methods of Particle Transport Theory (Butterworths, London).

Appendix

This Appendix gives a brief overview of some aspects of topology that are necessary for a proper understanding of the concepts introduced in this work.

A.1. Convergence in Topological Spaces: Sequence, Net and Filter

In the theory of convergence in topological spaces, countability plays an important role. To understand the significance of this concept, some preliminaries are needed.
The notion of a basis, or base, is a familiar one in analysis: a base is a subcollection of a set which may be used to construct, in a specified manner, any element of the set. This simplifies the statement of a problem since a smaller number of elements of the base can be used to generate the larger class of every element of the set. This philosophy finds application in topological spaces as follows.

Among the three properties (N1)–(N3) of the neighborhood system $\mathcal{N}_x$ of Tutorial 4, (N1) and (N2) are basic in the sense that the resulting subcollection of $\mathcal{N}_x$ can be used to generate the full system by applying (N3); this basic neighborhood system, or neighborhood (local) base $\mathcal{B}_x$ at $x$, is characterized by

(NB1) $x$ belongs to each member $B$ of $\mathcal{B}_x$.

(NB2) The intersection of any two members of $\mathcal{B}_x$ contains another member of $\mathcal{B}_x$: $B_1, B_2 \in \mathcal{B}_x \Rightarrow (\exists B \in \mathcal{B}_x : B \subseteq B_1 \cap B_2)$.

Formally, compare Eq. (18),

Definition A.1.1. A neighborhood (local) base $\mathcal{B}_x$ at $x$ in a topological space $(X, \mathcal{U})$ is a subcollection of the neighborhood system $\mathcal{N}_x$ having the property that each $N \in \mathcal{N}_x$ contains some member of $\mathcal{B}_x$. Thus

\[ \mathcal{B}_x \overset{\mathrm{def}}{=} \{B \in \mathcal{N}_x : x \in B \subseteq N \text{ for each } N \in \mathcal{N}_x\} \tag{A.1} \]

determines the full neighborhood system

\[ \mathcal{N}_x = \{N \subseteq X : x \in B \subseteq N \text{ for some } B \in \mathcal{B}_x\} \tag{A.2} \]

reciprocally as all supersets of the basic elements.

The entire neighborhood system $\mathcal{N}_x$, which is recovered from the base by forming all supersets of the basic neighborhoods, is trivially a local base at $x$; non-trivial examples are given below.

The second example of a base, consisting as usual of a subcollection of a given collection, is the topological base ${}_{\mathrm{T}}\mathcal{B}$ that allows the specification of the topology on a set $X$ in terms of a smaller collection of open sets.

Definition A.1.2. A base ${}_{\mathrm{T}}\mathcal{B}$ in a topological space $(X, \mathcal{U})$ is a subcollection of the topology $\mathcal{U}$ having the property that each $U \in \mathcal{U}$ contains some member of ${}_{\mathrm{T}}\mathcal{B}$. Thus

\[ {}_{\mathrm{T}}\mathcal{B} \overset{\mathrm{def}}{=} \{B \in \mathcal{U} : B \subseteq U \text{ for each } U \in \mathcal{U}\} \tag{A.3} \]

determines reciprocally the topology $\mathcal{U}$ as

\[ \mathcal{U} = \Bigl\{ U \subseteq X : U = \bigcup_{B \in {}_{\mathrm{T}}\mathcal{B}} B \Bigr\}. \tag{A.4} \]

This means that the topology on $X$ can be reconstructed from the base by taking all possible unions of members of the base, and a collection of subsets of a set $X$ is a topological base iff Eq. (A.4) of arbitrary unions of elements of ${}_{\mathrm{T}}\mathcal{B}$ generates a topology on $X$. This topology, which is the coarsest (that is the smallest) that contains ${}_{\mathrm{T}}\mathcal{B}$, is obviously closed under finite intersections. Since the open set $\mathrm{Int}(N)$ is a neighborhood of $x$ whenever $N$ is, Eq. (A.2) and the definition Eq. (17) of $\mathcal{N}_x$ imply that the open neighborhood system of any point in a topological space is an example of a neighborhood base at that point, an observation that has often led, together with Eq. (A.3), to the use of the term “neighborhood” as a synonym for “non-empty open set”. The distinction between the two however is significant as neighborhoods need not necessarily be open sets; thus while not necessary, it is clearly sufficient for the local basic sets $B$ to be open in Eqs. (A.1) and (A.2). If Eq. (A.2) holds for every $x \in N$, then the resulting $\mathcal{N}_x$ reduces to the topology induced by the open basic neighborhood system $\mathcal{B}_x$ as given by Eq. (18).

In order to check if a collection of subsets ${}_{\mathrm{T}}\mathcal{B}$ of $X$ qualifies to be a basis, it is not necessary to verify properties (T1)–(T3) of Tutorial 4 for the class (A.4) generated by it because of the properties (TB1) and (TB2) below whose strong affinity to (NB1) and (NB2) is formalized in Theorem A.1.1.

Theorem A.1.1. A collection ${}_{\mathrm{T}}\mathcal{B}$ of subsets of $X$ is a topological basis on $X$ iff

(TB1) $X = \bigcup_{B \in {}_{\mathrm{T}}\mathcal{B}} B$. Thus each $x \in X$ must belong to some $B \in {}_{\mathrm{T}}\mathcal{B}$ which implies the existence of a local base at each point $x \in X$.

(TB2) The intersection of any two members $B_1$ and $B_2$ of ${}_{\mathrm{T}}\mathcal{B}$ with $x \in B_1 \cap B_2$ contains another member of ${}_{\mathrm{T}}\mathcal{B}$: $(B_1, B_2 \in {}_{\mathrm{T}}\mathcal{B}) \wedge (x \in B_1 \cap B_2) \Rightarrow (\exists B \in {}_{\mathrm{T}}\mathcal{B} : x \in B \subseteq B_1 \cap B_2)$.

This theorem, together with Eq. (A.4), ensures that a given collection of subsets of a set $X$ satisfying (TB1) and (TB2) induces some topology on $X$; compared to this is the result that any collection of subsets of a set $X$ is a subbasis for some topology on $X$. If $X$, however, already has a topology $\mathcal{U}$ imposed on it, then Eq. (A.3)
must also be satisfied in order that the topology generated by ${}_{\mathrm{T}}\mathcal{B}$ is indeed $\mathcal{U}$. The next theorem connects the two types of bases of Definitions A.1.1 and A.1.2 by asserting that although a local base of a space need not consist of open sets and a topological base need not have any reference to a point of $X$, any subcollection of the base containing a point is a local base at that point.

Theorem A.1.2. A collection of open sets ${}_{\mathrm{T}}\mathcal{B}$ is a base for a topological space $(X, \mathcal{U})$ iff for each $x \in X$, the subcollection

\[ \mathcal{B}_x = \{B \in \mathcal{U} : x \in B \in {}_{\mathrm{T}}\mathcal{B}\} \tag{A.5} \]

of basic sets containing $x$ is a local base at $x$.

Proof. Necessity. Let ${}_{\mathrm{T}}\mathcal{B}$ be a base of $(X, \mathcal{U})$ and $N$ be a neighborhood of $x$, so that $x \in U \subseteq N$ for some open set $U = \bigcup_{B \in {}_{\mathrm{T}}\mathcal{B}} B$ and basic open sets $B$. Hence $x \in B \subseteq N$ shows, from Eq. (A.1), that $B \in \mathcal{B}_x$ is a local basic set at $x$.

Sufficiency. If $U$ is an open set of $X$ containing $x$, then the definition of local base Eq. (A.1) requires $x \in B_x \subseteq U$ for some subcollection of basic sets $B_x$ in $\mathcal{B}_x$; hence $U = \bigcup_{x \in U} B_x$. By Eq. (A.4) therefore, ${}_{\mathrm{T}}\mathcal{B}$ is a topological base for $X$.

Because the basic sets are open, (TB2) of Theorem A.1.1 leads to the following physically appealing paraphrase of Theorem A.1.2.

Corollary. A collection ${}_{\mathrm{T}}\mathcal{B}$ of open sets of $(X, \mathcal{U})$ is a topological base that generates $\mathcal{U}$ iff for each open set $U$ of $X$ and each $x \in U$ there is an open set $B \in {}_{\mathrm{T}}\mathcal{B}$ such that $x \in B \subseteq U$; that is iff

\[ x \in U \in \mathcal{U} \Rightarrow (\exists B \in {}_{\mathrm{T}}\mathcal{B} : x \in B \subseteq U). \]

Example A.1.1. Some examples of local bases in $\mathbb{R}$ are intervals of the type $(x - \varepsilon, x + \varepsilon)$, $[x - \varepsilon, x + \varepsilon]$ for real $\varepsilon$, $(x - q, x + q)$ for rational $q$, $(x - 1/n, x + 1/n)$ for $n \in \mathbb{Z}_+$, while for a metrizable space with the topology induced by a metric $d$, each of the following is a local base at $x \in X$: $B_\varepsilon(x; d) := \{y \in X : d(x, y) < \varepsilon\}$ and $D_\varepsilon(x; d) := \{y \in X : d(x, y) \le \varepsilon\}$ for $\varepsilon > 0$, $B_q(x; d)$ for $\mathbb{Q} \ni q > 0$ and $B_{1/n}(x; d)$ for $n \in \mathbb{Z}_+$. In $\mathbb{R}^2$, two neighborhood bases at any $x \in \mathbb{R}^2$ are the disks centered at $x$ and the set of all squares at $x$ with sides parallel to the axes. Although these bases have no elements in common, they are nevertheless equivalent in the sense that they both generate the same (usual) topology in $\mathbb{R}^2$. Of course, the entire neighborhood system at any point of a topological space is itself a (less useful) local base at that point. By Theorem A.1.2, $B_\varepsilon(x; d)$, $D_\varepsilon(x; d)$, $\varepsilon > 0$, $B_q(x; d)$, $\mathbb{Q} \ni q > 0$ and $B_{1/n}(x; d)$, $n \in \mathbb{Z}_+$, for all $x \in X$ are examples of bases in a metrizable space with topology induced by a metric $d$.

In terms of local bases and bases, it is now possible to formulate the notions of first and second countability as follows.

Definition A.1.3. A topological space is first countable if each $x \in X$ has some countable neighborhood base, and is second countable if it has a countable base.

Every metrizable space $(X, d)$ is first countable as both $\{B(x, q)\}_{\mathbb{Q} \ni q > 0}$ and $\{B(x, 1/n)\}_{n \in \mathbb{Z}_+}$ are examples of countable neighborhood bases at any $x \in (X, d)$; hence $\mathbb{R}^n$ is first countable. It should be clear that every second countable space is first countable and that a countable first countable space is second countable; a common example of an uncountable first countable space that is also second countable is provided by $\mathbb{R}^n$. Metrizable spaces need not be second countable: any uncountable set having the discrete topology is an example.

Example A.1.2. The following is an important example of a space that is not first countable, as it is needed for our pointwise biconvergence of Sec. 3. Let $\mathrm{Map}(X, Y)$ be the set of all functions between the uncountable spaces $(X, \mathcal{U})$ and $(Y, \mathcal{V})$. Given any integer $I \ge 1$, and any finite collection of points $(x_i)_{i=1}^I$ of $X$ and of open sets $(V_i)_{i=1}^I$ in $Y$, let

\[ B((x_i)_{i=1}^I; (V_i)_{i=1}^I) = \{g \in \mathrm{Map}(X, Y) : (g(x_i) \in V_i)(i = 1, 2, \ldots, I)\} \tag{A.6} \]

be the functions in $\mathrm{Map}(X, Y)$ whose graphs pass through each of the sets $(V_i)_{i=1}^I$ at $(x_i)_{i=1}^I$, and let ${}_{\mathrm{T}}\mathcal{B}$ be the collection of all such subsets of $\mathrm{Map}(X, Y)$ for every choice of $I$, $(x_i)_{i=1}^I$, and $(V_i)_{i=1}^I$. The existence of a unique topology $\mathcal{T}$ (the topology of pointwise convergence on $\mathrm{Map}(X, Y)$) that is generated by the open sets $B$ of the collection ${}_{\mathrm{T}}\mathcal{B}$ now follows because

(TB1) is satisfied: For any $f \in \mathrm{Map}(X, Y)$ there must be some $x \in X$ and a corresponding $V \subseteq Y$ such that $f(x) \in V$, and
(TB2) is satisfied because

\[ B((s_i)_{i=1}^I; (V_i)_{i=1}^I) \cap B((t_j)_{j=1}^J; (W_j)_{j=1}^J) = B((s_i)_{i=1}^I, (t_j)_{j=1}^J; (V_i)_{i=1}^I, (W_j)_{j=1}^J) \]

implies that a function simultaneously belonging to the two open sets on the left must pass through each of the points defining the open set on the right.

We now demonstrate that $(\mathrm{Map}(X, Y), \mathcal{T})$ is not first countable by verifying that it is not possible to have a countable local base at any $f \in \mathrm{Map}(X, Y)$. If this is not indeed true, let $B_f^I((x_i)_{i=1}^I; (V_i)_{i=1}^I) = \{g \in \mathrm{Map}(X, Y) : (g(x_i) \in V_i)_{i=1}^I\}$, which denotes those members of ${}_{\mathrm{T}}\mathcal{B}$ that contain $f$ with $V_i$ an open neighborhood of $f(x_i)$ in $Y$, be a countable local base at $f$, see Theorem A.1.2. Since $X$ is uncountable, it is now possible to choose some $x^* \in X$ different from any of the $(x_i)_{i=1}^I$ (for example, let $x^* \in \mathbb{R}$ be an irrational for rational $(x_i)_{i=1}^I$), and let $f(x^*) \in V^*$ where $V^*$ is an open neighborhood of $f(x^*)$. Then $B(x^*; V^*)$ is an open set in $\mathrm{Map}(X, Y)$ containing $f$; hence from the definition of the local base, Eq. (A.1), or equivalently from the Corollary to Theorem A.1.2, there exists some (countable) $I \in \mathbb{N}$ such that $f \in B^I \subseteq B(x^*; V^*)$. However,

\[ f^*(x) = \begin{cases} y_i \in V_i, & \text{if } x = x_i, \text{ and } 1 \le i \le I \\ y^* \notin V^*, & \text{if } x = x^* \\ \text{arbitrary}, & \text{otherwise} \end{cases} \]

is a simple example of a function on $X$ that is in $B^I$ (as it is immaterial as to what values the function takes at points other than those defining $B^I$), but not in $B(x^*; V^*)$. From this it follows that a sufficient condition for the topology of pointwise convergence to be first countable is that $X$ be countable.

Even though it is not first countable, $(\mathrm{Map}(X, Y), \mathcal{T})$ is a Hausdorff space when $Y$ is Hausdorff. Indeed, if $f, g \in (\mathrm{Map}(X, Y), \mathcal{T})$ with $f \ne g$, then $f(x) \ne g(x)$ for some $x \in X$. But then as $Y$ is Hausdorff, it is possible to choose disjoint open sets $V_f$ and $V_g$ at $f(x)$ and $g(x)$ respectively.

With this background on first and second countability, it is now possible to go back to the question of nets, filters and sequences. Technically, a sequence on a set $X$ is a map $x : \mathbb{N} \to X$ from the set of natural numbers to $X$; instead of denoting this in the usual functional manner of $x(i)$ with $i \in \mathbb{N}$, it is the standard practice to use the notation $(x_i)_{i \in \mathbb{N}}$ for the terms of a sequence. However, if the space $(X, \mathcal{U})$ is not first countable (and as seen above this is not a rare situation), it is not difficult to realize that sequences are inadequate to describe convergence in $X$ simply because a sequence can have only countably many values whereas the space may require uncountably many neighborhoods to completely define the neighborhood system at a point. The resulting uncountable generalization of a sequence in the form of nets and filters is achieved through a corresponding generalization of the index set $\mathbb{N}$ to the directed set $\mathbb{D}$.

Definition A.1.4. A directed set $\mathbb{D}$ is a preordered set for which the order $\preceq$, known as a direction of $\mathbb{D}$, satisfies

(a) $\alpha \in \mathbb{D} \Rightarrow \alpha \preceq \alpha$ (that is $\preceq$ is reflexive).
(b) $\alpha, \beta, \gamma \in \mathbb{D}$ such that $(\alpha \preceq \beta \wedge \beta \preceq \gamma) \Rightarrow \alpha \preceq \gamma$ (that is $\preceq$ is transitive).
(c) $\alpha, \beta \in \mathbb{D} \Rightarrow \exists \gamma \in \mathbb{D}$ such that $(\alpha \preceq \gamma \wedge \beta \preceq \gamma)$.

While the first two properties are obvious enough, the third, which replaces antisymmetry, ensures that for any finite number of elements of the directed set, there is always a successor (upper bound). Examples of directed sets can be both straightforward, as any totally ordered set like $\mathbb{N}$, $\mathbb{R}$, $\mathbb{Q}$, or $\mathbb{Z}$ and all subsets of a set $X$ under the superset or subset relation (that is $(\mathcal{P}(X), \supseteq)$ or $(\mathcal{P}(X), \subseteq)$) that are directed by their usual ordering, and not quite so obvious, as the following examples, which are significantly useful in dealing with convergence questions in topological spaces, amply illustrate.

The neighborhood system

\[ {}_{\mathbb{D}}N = \{N : N \in \mathcal{N}_x\} \]

at a point $x \in X$, directed by the reverse inclusion direction $\preceq$ defined as

\[ M \preceq N \Leftrightarrow N \subseteq M \quad \text{for } M, N \in \mathcal{N}_x, \tag{A.7} \]

is a fundamental example of a natural direction of $\mathcal{N}_x$. In fact while reflexivity and transitivity are clearly obvious, (c) follows because for any $M, N \in \mathcal{N}_x$, $M \preceq M \cap N$ and $N \preceq M \cap N$. Of course, this direction is not a total ordering on $\mathcal{N}_x$. A more naturally useful directed set in convergence theory is

\[ {}_{\mathbb{D}}N_t = \{(N, t) : (N \in \mathcal{N}_x)(t \in N)\} \tag{A.8} \]

under its natural direction

\[ (M, s) \preceq (N, t) \Leftrightarrow N \subseteq M \quad \text{for } M, N \in \mathcal{N}_x; \tag{A.9} \]

${}_{\mathbb{D}}N_t$ is more useful than ${}_{\mathbb{D}}N$ because, unlike the latter, ${}_{\mathbb{D}}N_t$ does not require a simultaneous choice
of points from every $N \in \mathcal{N}_x$ that implicitly involves a simultaneous application of the Axiom of Choice; see Example A.1.3 below. The general indexed variation

\[ {}_{\mathbb{D}}N_\beta = \{(N, \beta) : (N \in \mathcal{N}_x)(\beta \in \mathbb{D})(x_\beta \in N)\} \tag{A.10} \]

of Eq. (A.8), with natural direction

\[ (M, \alpha) \le (N, \beta) \Leftrightarrow (\alpha \preceq \beta) \wedge (N \subseteq M), \tag{A.11} \]

often proves useful in applications as will be clear from the proofs of Theorems A.1.3 and A.1.4.

Definition A.1.5 (Net). Let $X$ be any set and $\mathbb{D}$ a directed set. A net $\chi : \mathbb{D} \to X$ in $X$ is a function on the directed set $\mathbb{D}$ with values in $X$.

A net, to be denoted as $\chi(\alpha)$, $\alpha \in \mathbb{D}$, is therefore a function indexed by a directed set. We adopt the convention of denoting nets in the manner of functions and do not use the sequential notation $\chi_\alpha$ that can also be found in the literature. Thus, while every sequence is a special type of net, $\chi : \mathbb{Z} \to X$ is an example of a net that is not a sequence.

Convergence of sequences and nets is described most conveniently in terms of the notions of being eventually in and frequently in every neighborhood of points. We describe these concepts in terms of nets; they apply to sequences with obvious modifications.

Definition A.1.6. A net $\chi : \mathbb{D} \to X$ is said to be

(a) Eventually in a subset $A$ of $X$ if its tail is eventually in $A$: $(\exists \beta \in \mathbb{D}) : (\forall \gamma \succeq \beta)(\chi(\gamma) \in A)$.
(b) Frequently in a subset $A$ of $X$ if for any index $\beta \in \mathbb{D}$, there is a successor index $\gamma \in \mathbb{D}$ such that $\chi(\gamma)$ is in $A$: $(\forall \beta \in \mathbb{D})(\exists \gamma \succeq \beta) : (\chi(\gamma) \in A)$.

It is not difficult to appreciate that

(i) A net eventually in a subset is also frequently in it but not conversely,
(ii) A net eventually (respectively, frequently) in a subset cannot be frequently (respectively, eventually) in its complement.

With these notions of eventually in and frequently in, convergence characteristics of a net may be expressed as follows.

Definition A.1.7. A net $\chi : \mathbb{D} \to X$ converges to $x \in X$ if it is eventually in every neighborhood of $x$, that is

\[ (\forall N \in \mathcal{N}_x)(\exists \mu \in \mathbb{D})(\chi(\nu \succeq \mu) \in N). \]

The point $x$ is known as the limit of $\chi$ and the collection of all limits of a net is the limit set

\[ \lim(\chi) = \{x \in X : (\forall N \in \mathcal{N}_x)(\exists R_\beta \in \mathrm{Res}(\mathbb{D}))(\chi(R_\beta) \subseteq N)\} \tag{A.12} \]

of $\chi$, with the set of residuals $\mathrm{Res}(\mathbb{D})$ in $\mathbb{D}$ given by

\[ \mathrm{Res}(\mathbb{D}) = \{R_\alpha \in \mathcal{P}(\mathbb{D}) : R_\alpha = \{\beta \in \mathbb{D} \text{ for all } \beta \succeq \alpha \in \mathbb{D}\}\}. \tag{A.13} \]

The net adheres at $x \in X$ $^{27}$ if it is frequently in every neighborhood of $x$, that is

\[ ((\forall N \in \mathcal{N}_x)(\forall \mu \in \mathbb{D}))((\exists \nu \succeq \mu) : \chi(\nu) \in N). \]

The point $x$ is known as the adherent of $\chi$ and the collection of all adherents of $\chi$ is the adherent set of the net, which may be expressed in terms of the cofinal subset of $\mathbb{D}$

\[ \mathrm{Cof}(\mathbb{D}) = \{C_\alpha \in \mathcal{P}(\mathbb{D}) : C_\alpha = \{\beta \in \mathbb{D} \text{ for some } \beta \succeq \alpha \in \mathbb{D}\}\} \tag{A.14} \]

(thus $\mathbb{D}_\alpha$ is cofinal in $\mathbb{D}$ iff it intersects every residual in $\mathbb{D}$), as

\[ \mathrm{adh}(\chi) = \{x \in X : (\forall N \in \mathcal{N}_x)(\exists C_\beta \in \mathrm{Cof}(\mathbb{D}))(\chi(C_\beta) \subseteq N)\}. \tag{A.15} \]

This recognizes, in keeping with the limit set, each subnet of a net to be a net in its own right, and is equivalent to

\[ \mathrm{adh}(\chi) = \{x \in X : (\forall N \in \mathcal{N}_x)(\forall R_\alpha \in \mathrm{Res}(\mathbb{D}))(\chi(R_\alpha) \cap N \ne \emptyset)\}. \tag{A.16} \]

$^{27}$ This is also known as a cluster point; we shall, however, use this new term exclusively in the sense of the elements of a derived set, see Definition 2.3.

Intuitively, a sequence is eventually in a set $A$ if it is always in it after a finite number of terms (of course, the concept of a finite number of terms is unavailable for nets; in this case the situation may be described by saying that a net is eventually in $A$ if its tail is in $A$) and it is frequently in $A$ if it always returns to $A$ to leave it again. It can be shown that a net is eventually (resp. frequently) in a set iff it is not frequently (resp. eventually) in its complement.

The following examples illustrate graphically the role of a proper choice of the index set $\mathbb{D}$ in the description of convergence.
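As a concrete check of Definitions A.1.6 and A.1.7, the following Python sketch evaluates "eventually in", "frequently in", the limit set and the adherent set by brute force on a finite example. The three-point space, its topology and every function name below are assumptions introduced purely for illustration; the directed set is that of Eq. (A.8) at the point x = 1 under the direction (A.9), and the net is taken to be χ(N, t) = t.

```python
from itertools import combinations

# Hypothetical finite space: X = {0, 1, 2} with open sets {}, {0}, {0, 1}, X.
X = frozenset({0, 1, 2})
OPENS = [frozenset(), frozenset({0}), frozenset({0, 1}), X]

def powerset(xs):
    xs = sorted(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

def neighborhoods(x):
    """All N containing an open set that contains x (the system N_x)."""
    return [N for N in powerset(X) if any(x in U and U <= N for U in OPENS)]

def is_directed(D, leq):
    """Properties (a), (b), (c) of Definition A.1.4 for a finite preorder."""
    reflexive = all(leq(a, a) for a in D)
    transitive = all(leq(a, c) for a in D for b in D for c in D
                     if leq(a, b) and leq(b, c))
    upper_bounds = all(any(leq(a, g) and leq(b, g) for g in D)
                       for a in D for b in D)
    return reflexive and transitive and upper_bounds

def eventually_in(net, D, leq, A):
    """Definition A.1.6(a): some beta whose every successor is mapped into A."""
    return any(all(net(g) in A for g in D if leq(b, g)) for b in D)

def frequently_in(net, D, leq, A):
    """Definition A.1.6(b): every beta has a successor mapped into A."""
    return all(any(net(g) in A for g in D if leq(b, g)) for b in D)

def limit_set(net, D, leq):
    return {y for y in X
            if all(eventually_in(net, D, leq, N) for N in neighborhoods(y))}

def adherent_set(net, D, leq):
    return {y for y in X
            if all(frequently_in(net, D, leq, N) for N in neighborhoods(y))}

# Directed set of Eq. (A.8) at x = 1, ordered by Eq. (A.9): (M,s) <= (N,t) iff N is a subset of M.
D = [(N, t) for N in neighborhoods(1) for t in sorted(N)]
leq = lambda a, b: b[0] <= a[0]
net = lambda a: a[1]          # chi(N, t) = t

lim_chi = limit_set(net, D, leq)
adh_chi = adherent_set(net, D, leq)

# Duality: eventually in A iff not frequently in the complement of A.
duality = all(eventually_in(net, D, leq, A) == (not frequently_in(net, D, leq, X - A))
              for A in powerset(X))

print(is_directed(D, leq), sorted(lim_chi), sorted(adh_chi), duality)
```

In this small example the net is frequently but not eventually in {1}, and its limit set {1, 2} is a proper subset of its adherent set {0, 1, 2}. Because the direction (A.9) is only a preorder, the pairs (N, s) and (N, t) over the same neighborhood are mutually comparable, which is what lets "frequently in" differ from "eventually in" even on a finite index set.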
Example A.1.3. (1) Let $\gamma \in \mathbb{D}$. The eventually constant net $\chi(\delta) = x$ for $\delta \succeq \gamma$ converges to $x$.

(2) Let $\mathcal{N}_x$ be a neighborhood system at a point $x$ in $X$ and suppose that the net $(\chi(N))_{N \in \mathcal{N}_x}$ is defined by

\[ \chi(M) \overset{\mathrm{def}}{=} s \in M; \tag{A.17} \]

here the directed index set ${}_{\mathbb{D}}N$ is ordered by the natural direction (A.7) of $\mathcal{N}_x$. Then $\chi(N) \to x$ because given any $x$-neighborhood $M \in {}_{\mathbb{D}}N$, it follows from

\[ M \preceq N \in {}_{\mathbb{D}}N \Rightarrow \chi(N) = t \in N \subseteq M \tag{A.18} \]

that a point in any subset of $M$ is also in $M$; $\chi(N)$ is therefore eventually in every neighborhood of $x$.

(3) This slightly more general form of the previous example provides a link between the complementary concepts of nets and filters that is considered below. For a point $x \in X$, and $M, N \in \mathcal{N}_x$ with the corresponding directed set ${}_{\mathbb{D}}M_s$ of Eq. (A.8) ordered by its natural order (A.9), the net

\[ \chi(M, s) \overset{\mathrm{def}}{=} s \tag{A.19} \]

converges to $x$ because, as in the previous example, for any given $(M, s) \in {}_{\mathbb{D}}M_s$, it follows from

\[ (M, s) \preceq (N, t) \in {}_{\mathbb{D}}M_s \Rightarrow \chi(N, t) = t \in N \subseteq M \tag{A.20} \]

that $\chi(N, t)$ is eventually in every neighborhood $M$ of $x$. The significance of the directed set ${}_{\mathbb{D}}N_t$ of Eq. (A.8), as compared to ${}_{\mathbb{D}}N$, is evident from the net that it induces without using the Axiom of Choice: For a subset $A$ of $X$, the net $\chi(N, t) = t \in A$ indexed by the directed set

\[ {}_{\mathbb{D}}N_t = \{(N, t) : (N \in \mathcal{N}_x)(t \in N \cap A)\} \tag{A.21} \]

under the direction of Eq. (A.9), converges to $x \in X$ with all such $x$ defining the closure $\mathrm{Cl}(A)$ of $A$. Furthermore taking the directed set to be

\[ {}_{\mathbb{D}}N_t = \{(N, t) : (N \in \mathcal{N}_x)(t \in N \cap A - \{x\})\} \tag{A.22} \]

which, unlike Eq. (A.21), excludes the point $x$ that may or may not be in the subset $A$ of $X$, induces the net $\chi(N, t) = t \in A - \{x\}$ converging to $x \in X$, with the set of all such $x$ yielding the derived set $\mathrm{Der}(A)$ of $A$. In contrast, Eq. (A.21) also includes the isolated points $t = x$ of $A$ so as to generate its closure. Observe how neighborhoods of a point, which define convergence of nets and filters in a topological space $X$, double up here as index sets to yield a self-consistent tool for the description of convergence.

As compared with sequences, where the index set is restricted to positive integers, the considerable freedom in the choice of directed sets, as is abundantly borne out by the two preceding examples, is not without its associated drawbacks. Thus as a trade-off, the wide range of choice of the directed sets may imply that induction methods, so common in the analysis of sequences, need no longer apply to arbitrary nets.

(4) The non-convergent nets (actually these are sequences)

(a) $(1, -1, 1, -1, \ldots)$ adheres at $1$ and $-1$ and

(b) \[ x_n = \begin{cases} n, & \text{if } n \text{ is odd} \\ 1 - 1/(1 + n), & \text{if } n \text{ is even} \end{cases} \]

adheres at $1$ for its even terms, but is unbounded in the odd terms.

A converging sequence or net is also adhering but, as examples (4) show, the converse is false. Nevertheless it is true, as again is evident from examples (4), that in a first countable space where sequences suffice, a sequence $(x_n)$ adheres to $x$ iff some subsequence $(x_{n_m})_{m \in \mathbb{N}}$ of $(x_n)$ converges to $x$. If the space is not first countable this has a corresponding equivalent formulation for nets with subnets replacing subsequences as follows.

Let $(\chi(\alpha))_{\alpha \in \mathbb{D}}$ be a net. A subnet of $\chi(\alpha)$ is the net $\zeta(\beta) = \chi(\sigma(\beta))$, $\beta \in \mathbb{E}$, where $\sigma : (\mathbb{E}, \le) \to (\mathbb{D}, \preceq)$ is a function that captures the essence of the subsequential mapping $n \mapsto n_m$ in $\mathbb{N}$ by satisfying

(SN1) $\sigma$ is an increasing order-preserving function: it respects the order of $\mathbb{E}$: $\sigma(\beta) \preceq \sigma(\beta')$ for every $\beta \le \beta' \in \mathbb{E}$, and

(SN2) For every $\alpha \in \mathbb{D}$ there exists a $\beta \in \mathbb{E}$ such that $\alpha \preceq \sigma(\beta)$.

These generalize the essential properties of a subsequence in the sense that (1) Even though the index sets $\mathbb{D}$ and $\mathbb{E}$ may be different, it is necessary that the values of $\mathbb{E}$ be contained in $\mathbb{D}$, and (2) There are arbitrarily large $\alpha \in \mathbb{D}$ such that $\chi(\alpha = \sigma(\beta))$ is a value of the subnet $\zeta(\beta)$ for some $\beta \in \mathbb{E}$. Recalling the first of the order relations Eq. (38) on $\mathrm{Map}(X, Y)$, we will denote a subnet $\zeta$ of $\chi$ by $\zeta \preceq \chi$.

We now consider the concept of filter on a set $X$ that is very useful in visualizing the behavior of sequences and nets, and in fact filters constitute an alternate way of looking at convergence questions in topological spaces. A filter $\mathcal{F}$ on a set $X$
is a collection of nonempty subsets of $X$ satisfying properties (F1)–(F3) below that are simply those of a neighborhood system $\mathcal{N}_x$ without specification of the reference point $x$.

(F1) The empty set $\emptyset$ does not belong to $\mathcal{F}$,
(F2) The intersection of any two members of a filter is another member of the filter: $F_1, F_2 \in \mathcal{F} \Rightarrow F_1 \cap F_2 \in \mathcal{F}$,
(F3) Every superset of a member of a filter belongs to the filter: $(F \in \mathcal{F}) \wedge (F \subseteq G) \Rightarrow G \in \mathcal{F}$; in particular $X \in \mathcal{F}$.

Example A.1.4

(1) The indiscrete filter is the smallest filter on $X$.
(2) The neighborhood system $\mathcal{N}_x$ is the important neighborhood filter at $x$ on $X$, and any local base at $x$ is also a filter-base for $\mathcal{N}_x$. In general for any subset $A$ of $X$, $\{N \subseteq X : A \subseteq \mathrm{Int}(N)\}$ is a filter on $X$ at $A$.
(3) All subsets of $X$ containing a point $x \in X$ form the principal filter ${}_{\mathrm{F}}\mathcal{P}(x)$ on $X$ at $x$. More generally, if $\mathcal{F}$ consists of all supersets of a nonempty subset $A$ of $X$, then $\mathcal{F}$ is the principal filter ${}_{\mathrm{F}}\mathcal{P}(A) = \{N \subseteq X : A \subseteq \mathrm{Int}(N)\}$ at $A$. Adjoining the empty set to these filters gives the p-inclusion and A-inclusion topologies on $X$, respectively. The single element sets $\{\{x\}\}$ and $\{A\}$ are particularly simple examples of filter-bases that generate the principal filters at $x$ and $A$.
(4) For an uncountable (resp. infinite) set $X$, all cocountable (resp. cofinite) subsets of $X$ constitute the cocountable (resp. cofinite or Fréchet) filter on $X$. Again, adding to these filters the empty set gives the respective topologies.

Like the topological and local bases ${}_{\mathrm{T}}\mathcal{B}$ and $\mathcal{B}_x$ respectively, a subclass of $\mathcal{F}$ may be used to define a filter-base ${}_{\mathrm{F}}\mathcal{B}$ that in turn generates $\mathcal{F}$ on $X$, just as it is possible to define the concepts of limit and adherence sets for a filter to parallel those for nets that follow straightforwardly from Definition A.1.7, taken with Definition A.1.11.

Definition A.1.8. Let $(X, \mathcal{T})$ be a topological space and $\mathcal{F}$ a filter on $X$. Then

\[ \lim(\mathcal{F}) = \{x \in X : (\forall N \in \mathcal{N}_x)(\exists F \in \mathcal{F})(F \subseteq N)\} \tag{A.23} \]

and

\[ \mathrm{adh}(\mathcal{F}) = \{x \in X : (\forall N \in \mathcal{N}_x)(\forall F \in \mathcal{F})(F \cap N \ne \emptyset)\} \tag{A.24} \]

are respectively the sets of limit points and adherent points of $\mathcal{F}$.$^{28}$

$^{28}$ The restatement

\[ \mathcal{F} \to x \Leftrightarrow \mathcal{N}_x \subseteq \mathcal{F} \tag{A.25} \]

of Eq. (A.23) that follows from (F3), and sometimes taken as the definition of convergence of a filter, is significant as it ties up the algebraic filter with the topological neighborhood system to produce the filter theory of convergence in topological spaces. From the defining properties of $\mathcal{F}$ it follows that for each $x \in X$, $\mathcal{N}_x$ is the coarsest (that is smallest) filter on $X$ that converges to $x$.

A comparison of Eqs. (A.12) and (A.16) with Eqs. (A.23) and (A.24) respectively demonstrates their formal similarity; this inter-relation between filters and nets will be made precise in Definitions A.1.10 and A.1.11 below. It should be clear from the preceding two equations that

\[ \lim(\mathcal{F}) \subseteq \mathrm{adh}(\mathcal{F}), \tag{A.26} \]

with a similar result

\[ \lim(\chi) \subseteq \mathrm{adh}(\chi) \tag{A.27} \]

holding for nets because of the duality between nets and filters as displayed by Definitions A.1.9 and A.1.10 below, with the equality in Eqs. (A.26) and (A.27) being true (but not characterizing) for ultrafilters and ultranets respectively, see Example 4.2(3) for an account of this notion. It should be clear from the equations of Definition A.1.8 that

\[ \mathrm{adh}(\mathcal{F}) = \{x \in X : (\exists \text{ a finer filter } \mathcal{G} \supseteq \mathcal{F} \text{ on } X)(\mathcal{G} \to x)\} \tag{A.28} \]

consists of all the points of $X$ to which some finer filter $\mathcal{G}$ (in the sense that $\mathcal{F} \subseteq \mathcal{G}$ implies every element of $\mathcal{F}$ is also in $\mathcal{G}$) converges in $X$; thus

\[ \mathrm{adh}(\mathcal{F}) = \bigcup \lim(\mathcal{G} : \mathcal{G} \supseteq \mathcal{F}), \]

which corresponds to the net-result of Theorem A.1.5 below, that a net $\chi$ adheres to $x$ iff there is some subnet of $\chi$ that converges to $x$ in $X$. Thus if $\zeta \preceq \chi$ is a subnet of $\chi$ and $\mathcal{F} \subseteq \mathcal{G}$ is a filter coarser than $\mathcal{G}$ then

\[ \begin{array}{ll} \lim(\chi) \subseteq \lim(\zeta) & \lim(\mathcal{F}) \subseteq \lim(\mathcal{G}) \\ \mathrm{adh}(\zeta) \subseteq \mathrm{adh}(\chi) & \mathrm{adh}(\mathcal{G}) \subseteq \mathrm{adh}(\mathcal{F}); \end{array} \]
a filter $\mathcal{G}$ finer than a given filter $\mathcal{F}$ corresponds to a subnet $\zeta$ of a given net $\chi$. The implication of this correspondence should be clear from the association between nets and filters contained in Definitions A.1.10 and A.1.11.

A filter-base in $X$ is a non-empty family $(B_\alpha)_{\alpha \in \mathbb{D}} = {}_{\mathrm{F}}\mathcal{B}$ of subsets of $X$ characterized by

(FB1) There are no empty sets in the collection ${}_{\mathrm{F}}\mathcal{B}$: $(\forall \alpha \in \mathbb{D})(B_\alpha \ne \emptyset)$

(FB2) The intersection of any two members of ${}_{\mathrm{F}}\mathcal{B}$ contains another member of ${}_{\mathrm{F}}\mathcal{B}$: $B_\alpha, B_\beta \in {}_{\mathrm{F}}\mathcal{B} \Rightarrow (\exists B \in {}_{\mathrm{F}}\mathcal{B} : B \subseteq B_\alpha \cap B_\beta)$;

hence any class of subsets of $X$ that does not contain the empty set and is closed under finite intersections is a base for a unique filter on $X$; compare the properties (NB1) and (NB2) of a local basis given at the beginning of this Appendix. Similar to Definition A.1.1 for the local base, it is possible to define

Definition A.1.9. A filter-base ${}_{\mathrm{F}}\mathcal{B}$ in a set $X$ is a subcollection of the filter $\mathcal{F}$ on $X$ having the property that each $F \in \mathcal{F}$ contains some member of ${}_{\mathrm{F}}\mathcal{B}$. Thus

\[ {}_{\mathrm{F}}\mathcal{B} \overset{\mathrm{def}}{=} \{B \in \mathcal{F} : B \subseteq F \text{ for each } F \in \mathcal{F}\} \tag{A.29} \]

determines the filter

\[ \mathcal{F} = \{F \subseteq X : B \subseteq F \text{ for some } B \in {}_{\mathrm{F}}\mathcal{B}\} \tag{A.30} \]

reciprocally as all supersets of the basic elements.

This is the smallest filter on $X$ that contains ${}_{\mathrm{F}}\mathcal{B}$ and is said to be the filter generated by its filter-base ${}_{\mathrm{F}}\mathcal{B}$; alternatively ${}_{\mathrm{F}}\mathcal{B}$ is the filter-base of $\mathcal{F}$. The entire neighborhood system $\mathcal{N}_x$, the local base $\mathcal{B}_x$, $\mathcal{N}_x \cap A$ for $x \in \mathrm{Cl}(A)$, and the set of all residuals of a directed set $\mathbb{D}$ are among the most useful examples of filter-bases on $X$, $A$ and $\mathbb{D}$ respectively. Of course, every filter is trivially a filter-base of itself, and the singletons $\{\{x\}\}$, $\{A\}$ are filter-bases that generate the principal filters ${}_{\mathrm{F}}\mathcal{P}(x)$ and ${}_{\mathrm{F}}\mathcal{P}(A)$ at $x$ and $A$ respectively.

Paralleling the case of topological subbase ${}_{\mathrm{T}}\mathcal{S}$, a filter subbase ${}_{\mathrm{F}}\mathcal{S}$ can be defined on $X$ to be any collection of subsets of $X$ with the finite intersection property (as compared with ${}_{\mathrm{T}}\mathcal{S}$ where no such condition was necessary, this represents the fundamental point of departure between topology and filter) and it is not difficult to deduce that the filter generated by ${}_{\mathrm{F}}\mathcal{S}$ on $X$ is obtained by taking all finite intersections ${}_{\mathrm{F}}\mathcal{S}_\wedge$ of members of ${}_{\mathrm{F}}\mathcal{S}$ followed by their supersets ${}_{\mathrm{F}}\mathcal{S}_{\Sigma\wedge}$. $\mathcal{F}({}_{\mathrm{F}}\mathcal{S}) := {}_{\mathrm{F}}\mathcal{S}_{\Sigma\wedge}$ is the smallest filter on $X$ that contains ${}_{\mathrm{F}}\mathcal{S}$ and is the filter generated by ${}_{\mathrm{F}}\mathcal{S}$.

Equation (A.24) can be put in the more useful and transparent form given by

Theorem A.1.3. For a filter $\mathcal{F}$ in a space $(X, \mathcal{T})$

\[ \mathrm{adh}(\mathcal{F}) = \bigcap_{F \in \mathcal{F}} \mathrm{Cl}(F) = \bigcap_{B \in {}_{\mathrm{F}}\mathcal{B}} \mathrm{Cl}(B), \tag{A.31} \]

and dually $\mathrm{adh}(\chi)$, are closed sets.

Proof. Follows immediately from the definitions for the closure of a set Eq. (20) and the adherence of a filter Eq. (A.24). As always, it is a matter of convenience in using the basic filters ${}_{\mathrm{F}}\mathcal{B}$ instead of $\mathcal{F}$ to generate the adherence set.

It is in fact true that the limit sets $\lim(\mathcal{F})$ and $\lim(\chi)$ are also closed sets of $X$; the arguments involving ultrafilters are omitted.

Similar to the notion of the adherence set of a filter is its core, a concept that unlike the adherence is purely set-theoretic, being the infimum of the filter, and is not linked with any topological structure of the underlying (infinite) set $X$. It is defined as

\[ \mathrm{core}(\mathcal{F}) = \bigcap_{F \in \mathcal{F}} F. \tag{A.32} \]

From Theorem A.1.3 and the fact that the closure of a set $A$ is the smallest closed set that contains $A$, see Eq. (25) at the end of Tutorial 4, it is clear that in terms of filters

\[ \begin{aligned} A &= \mathrm{core}({}_{\mathrm{F}}\mathcal{P}(A)) \\ \mathrm{Cl}(A) &= \mathrm{adh}({}_{\mathrm{F}}\mathcal{P}(A)) = \mathrm{core}(\mathrm{Cl}({}_{\mathrm{F}}\mathcal{P}(A))) \end{aligned} \tag{A.33} \]

where ${}_{\mathrm{F}}\mathcal{P}(A)$ is the principal filter at $A$; thus the core and adherence sets of the principal filter at $A$ are equal respectively to $A$ and $\mathrm{Cl}(A)$ (a classic example of equality in the general relation $\mathrm{Cl}(\bigcap A_\alpha) \subseteq \bigcap \mathrm{Cl}(A_\alpha)$) but both are empty, for example, in the case of an infinitely decreasing family of rationals centered at any irrational (leading to a principal filter-base of rationals at the chosen irrational). This is an important example demonstrating that the infinite intersection of a non-empty family of (closed) sets with the finite intersection property may be empty, a situation that cannot arise on a finite set or an infinite compact set. Filters on
$X$ with an empty core are said to be free, and are fixed otherwise: notice that by its very definition filters cannot be free on a finite set, and a free filter represents an additional feature that may arise in passing from finite to infinite sets. Clearly $(\mathrm{adh}(\mathcal{F}) = \emptyset) \Rightarrow (\mathrm{core}(\mathcal{F}) = \emptyset)$, but as the important example of the rational space in the reals illustrates, the converse need not be true. Another example of a free filter of the same type is provided by the filter-base $\{[a, \infty) : a \in \mathbb{R}\}$ in $\mathbb{R}$. Both these examples illustrate the important property that a filter is free iff it contains the cofinite filter, and the cofinite filter is the smallest possible free filter on an infinite set. The free cofinite filter, as these examples illustrate, may be typically generated as follows. Let $A$ be a subset of $X$, $x \in \mathrm{Bdy}_{X-A}(A)$, and consider the directed set Eq. (A.21) to generate the corresponding net in $A$ given by $\chi(N \in \mathcal{N}_x, t) = t \in A$. Quite clearly, the core of any Fréchet filter based on this net must be empty as the point $x$ does not lie in $A$. In general, the intersection is empty because if it were not so then the complement of the intersection, which is an element of the filter, would be infinite in contravention of the hypothesis that the filter is Fréchet. It should be clear that every filter finer than a free filter is also free, and any filter coarser than a fixed filter is fixed.

Nets and filters are complementary concepts and one may switch from one to the other as follows.

Definition A.1.10. Let $\mathcal{F}$ be a filter on $X$ and let ${}_{\mathbb{D}}F_x = \{(F, x) : (F \in \mathcal{F})(x \in F)\}$ be a directed set with its natural direction $(F, x) \preceq (G, y) \Rightarrow (G \subseteq F)$. The net $\chi_{\mathcal{F}} : {}_{\mathbb{D}}F_x \to X$ defined by

\[ \chi_{\mathcal{F}}(F, x) = x \]

is said to be associated with the filter $\mathcal{F}$, see Eq. (A.20).

Definition A.1.11. Let $\chi : \mathbb{D} \to X$ be a net and $R_\alpha = \{\beta \in \mathbb{D} : \beta \succeq \alpha \in \mathbb{D}\}$ a residual in $\mathbb{D}$. Then

\[ {}_{\mathrm{F}}\mathcal{B}_\chi \overset{\mathrm{def}}{=} \{\chi(R_\alpha) : \mathrm{Res}(\mathbb{D}) \to X \text{ for all } \alpha \in \mathbb{D}\} \]

is the filter-base associated with $\chi$, and the corresponding filter $\mathcal{F}_\chi$ obtained by taking all supersets of the elements of ${}_{\mathrm{F}}\mathcal{B}_\chi$ is the filter associated with $\chi$.

${}_{\mathrm{F}}\mathcal{B}_\chi$ is a filter-base in $X$ because $\chi(\bigcap R_\alpha) \subseteq \bigcap \chi(R_\alpha)$, which holds for any functional relation, proves (FB2). It is not difficult to verify that

(i) $\chi$ is eventually in $A \Rightarrow A \in \mathcal{F}_\chi$, and
(ii) $\chi$ is frequently in $A \Rightarrow (\forall R_\alpha \in \mathrm{Res}(\mathbb{D}))(A \cap \chi(R_\alpha) \ne \emptyset) \Rightarrow A \cap \mathcal{F}_\chi \ne \emptyset$.

Limits and adherences are obviously preserved in switching between nets (respectively, filters) and the filters (respectively, nets) that they generate:

\[ \lim(\chi) = \lim(\mathcal{F}_\chi), \quad \mathrm{adh}(\chi) = \mathrm{adh}(\mathcal{F}_\chi) \tag{A.34} \]

\[ \lim(\mathcal{F}) = \lim(\chi_{\mathcal{F}}), \quad \mathrm{adh}(\mathcal{F}) = \mathrm{adh}(\chi_{\mathcal{F}}). \tag{A.35} \]

The proofs of the two parts of Eq. (A.34), for example, go respectively as follows. $x \in \lim(\chi) \Leftrightarrow \chi$ is eventually in $\mathcal{N}_x \Leftrightarrow (\forall N \in \mathcal{N}_x)(\exists F \in \mathcal{F}_\chi)$ such that $(F \subseteq N) \Leftrightarrow x \in \lim(\mathcal{F}_\chi)$, and $x \in \mathrm{adh}(\chi) \Leftrightarrow \chi$ is frequently in $\mathcal{N}_x \Leftrightarrow (\forall N \in \mathcal{N}_x)(\forall F \in \mathcal{F}_\chi)(N \cap F \ne \emptyset) \Leftrightarrow x \in \mathrm{adh}(\mathcal{F}_\chi)$; here $F$ is a superset of $\chi(R_\alpha)$.

Some examples of convergence of filters are

(1) Any filter on an indiscrete space $X$ converges to every point of $X$.
(2) Any filter on a space that coincides with its topology (minus the empty set, of course) converges to every point of the space.
(3) For each $x \in X$, the neighborhood filter $\mathcal{N}_x$ converges to $x$; this is the smallest filter on $X$ that converges to $x$.
(4) The indiscrete filter $\mathcal{F} = \{X\}$ converges to no point in the space $(X, \{\emptyset, A, X - A, X\})$, but converges to every point of $X - A$ if $X$ has the topology $\{\emptyset, A, X\}$ because the only neighborhood of any point in $X - A$ is $X$ which is contained in the filter.

One of the most significant consequences of convergence theory of sequences and nets, as shown by the two theorems and the corollary following, is that this can be used to describe the topology of a set. The proofs of the theorems also illustrate the close inter-relationship between nets and filters.

Theorem A.1.4. For a subset $A$ of a topological space $X$,

\[ \mathrm{Cl}(A) = \{x \in X : (\exists \text{ a net } \chi \text{ in } A)(\chi \to x)\}. \tag{A.36} \]

Proof. Necessity. For $x \in \mathrm{Cl}(A)$, construct a net $\chi \to x$ in $A$ as follows. Let $\mathcal{B}_x$ be a topological local base at $x$, which by definition is the collection of all open sets of $X$ containing $x$. For each $\beta \in \mathbb{D}$, the sets

\[ N_\beta = \bigcap_{\alpha \preceq \beta} \{B_\alpha : B_\alpha \in \mathcal{B}_x\} \]
form a nested decreasing local neighborhood filter base at x. With respect to the directed set _D N_β = {(Nβ, β) : (β ∈ D)(xβ ∈ Nβ)} of Eq. (A.10), define the desired net in A by

    χ(Nβ, β) = xβ ∈ Nβ ∩ A

where the family of non-empty decreasing subsets Nβ ∩ A of X constitute the filter-base in A as required by the directed set _D N_β. It now follows from Eq. (A.11) and the arguments in Example A.1.3(3) that xβ → x; compare the directed set of Eq. (A.21) for a more compact, yet essentially identical, argument. Carefully observe the dual roles of 𝒩x as a neighborhood filter base at x.

Sufficiency. Let χ be a net in A that converges to x ∈ X. For any Nα ∈ 𝒩x, there is a Rα ∈ Res(D) of Eq. (A.13) such that χ(Rα) ⊆ Nα. Hence the point χ(α) = xα of A belongs to Nα so that A ∩ Nα ≠ ∅ which means, from Eq. (20), that x ∈ Cl(A).

Corollary. Together with Eqs. (20) and (22), it follows that

    Der(A) = {x ∈ X : (∃ a net ζ in A − {x})(ζ → x)}                      (A.37)

The filter forms of Eqs. (A.36) and (A.37)

    Cl(A)  = {x ∈ X : (∃ a filter ℱ on X)(A ∈ ℱ)(ℱ → x)}
    Der(A) = {x ∈ X : (∃ a filter ℱ on X)(A − {x} ∈ ℱ)(ℱ → x)}           (A.38)

then follow from Eq. (A.25) and the finite intersection property (F2) of ℱ so that every neighborhood of x must intersect A (respectively A − {x}) in Eq. (A.38) to produce the converging net needed in the proof of Theorem A.1.3.

We end this discussion of convergence in topological spaces with a proof of the following theorem which demonstrates the relationship that “eventually in” and “frequently in” bear to each other; Eq. (A.39) below is the net-counterpart of the filter equation (A.28).

Theorem A.1.5. If χ is a net in a topological space X, then x ∈ adh(χ) iff some subnet ζ(β) = χ(σ(β)) of χ(α), with α ∈ D and β ∈ E, converges in X to x; thus

    adh(χ) = {x ∈ X : (∃ a subnet ζ ⪯ χ in X)(ζ → x)}.                    (A.39)

Proof. Necessity. Let x ∈ adh(χ). Define a subnet function σ : _D N_α → D by σ(Nα, α) = α where _D N_α is the directed set of Eq. (A.10): (SN1) and (SN2) are quite evidently satisfied according to Eq. (A.11). Proceeding as in the proof of the preceding theorem it follows that xβ = χ(σ(Nα, α)) = ζ(Nα, α) → x is the required converging subnet that exists from Eq. (A.15) and the fact that χ(Rα) ∩ Nα ≠ ∅ for every Nα ∈ 𝒩x, by hypothesis.

Sufficiency. Assume now that χ has a subnet ζ(Nα, α) that converges to x. If χ does not adhere at x, there is a neighborhood Nα of x not frequented by it, in which case χ must be eventually in X − Nα. Then ζ(Nα, α) is also eventually in X − Nα so that ζ cannot be eventually in Nα, a contradiction of the hypothesis that ζ(Nα, α) → x.29

29 In a first countable space, while the corresponding proof of the first part of the theorem for sequences is essentially the same as in the present case, the more direct proof of the converse illustrates how the convenience of nets and directed sets may require more general arguments. Thus if a sequence (x_i)_{i∈ℕ} has a subsequence (x_{i_k})_{k∈ℕ} converging to x, then a more direct line of reasoning proceeds as follows. Since the subsequence converges to x, its tail (x_{i_k})_{k≥j} must be in every neighborhood N of x. But as the number of such terms is infinite whereas {i_k : k < j} is only finite, it is necessary that for any given n ∈ ℕ, cofinitely many elements of the sequence (x_{i_k})_{i_k≥n} be in N. Hence x ∈ adh((x_i)_{i∈ℕ}).

Equations (A.36) and (A.39) imply that the closure of a subset A of X is the class of X-adherences of all the (sub)nets of X that are eventually in A. This includes both the constant nets yielding the isolated points of A and the non-constant nets leading to the cluster points of A, and implies the following physically useful relationship between convergence and topology that can be used as defining criteria for open and closed sets having a more appealing physical significance than the original definitions of these terms. Clearly, the term “net” is justifiably used here to include the subnets too.

The following corollary of Theorem A.1.5 summarizes the basic topological properties of sets in terms of nets (respectively, filters).

Corollary. Let A be a subset of a topological space X. Then

(1) A is closed in X iff every convergent net of X that is eventually in A actually converges to a point in A (respectively, iff the adhering
points of each filter-base on A all belong to A). Thus no X-convergent net in a closed subset may converge to a point outside it.
(2) A is open in X iff every convergent net of X that converges to a point in A is eventually in A. Thus no X-convergent net outside an open subset may converge to a point in the set.
(3) A is closed-and-open (clopen) in X iff every convergent net of X that converges in A is eventually in A and conversely.
(4) x ∈ Der(A) iff some net (respectively, filter-base) in A − {x} converges to x; this clearly eliminates the isolated points of A and x ∈ Cl(A) iff some net (respectively, filter-base) in A converges to x.

Remark. The differences in these characterizations should be fully appreciated: If we consider the cluster points Der(A) of a net χ in A as the resource generated by χ, then a closed subset of X can be considered to be selfish as it keeps all its resources to itself: Der(A) ∩ A = Der(A). The opposite of this is a donor set that donates all its generated resources to its neighbor: Der(A) ∩ (X − A) = Der(A), while for a neutral set, both Der(A) ∩ A ≠ ∅ and Der(A) ∩ (X − A) ≠ ∅ implying that the convergence resources generated in A and X − A can be deposited only in the respective sets. The clopen sets (see diagram 2–2 of Fig. 22) are of some special interest as they are boundary-less so that no net-resources can be generated in this case as any such limit is required to be simultaneously in the set and its complement.

Example A.1.2. (Continued). This continuation of Example A.1.2 illustrates how sequential convergence is inadequate in spaces that are not first countable like the uncountable set with cocountable topology. In this topology, a sequence can converge to a point x in the space iff it has only a finite number of distinct terms, and is therefore eventually constant. Indeed, let the complement

    G := X − F,   F = {x_i : x_i ≠ x, i ∈ ℕ}

of the countably closed sequential set F be an open neighborhood of x ∈ X. Because a sequence (x_i)_{i∈ℕ} in X converges to a point x ∈ X iff it is eventually in every neighborhood (including G) of x, the sequence represented by the set F cannot converge to x unless it is of the uncountable type30

    (x_0, x_1, . . . , x_I, x_{I+1}, x_{I+1}, . . .)                       (A.40)

with only a finite number I of distinct terms actually belonging to the closed sequential set F = X − G, and x_{I+1} = x. Note that as we are concerned only with the eventual behavior of the sequence, we may discard all distinct terms from G by considering them to be in F, and retain only the constant sequence (x, x, . . .) in G. In comparison with the cofinite case that was considered in Sec. 4, the entire countably infinite sequence can now lie outside a neighborhood of x thereby enforcing the eventual constancy of the sequence. This leads to a generalization of our earlier cofinite result in the sense that a cocountable filter on a cocountable space converges to every point in the space.

30 This is uncountable because interchanging any two eventual terms of the sequence does not alter the sequence.

It is now straightforward to verify that for a point x_0 in an uncountable cocountable space X

(a) Even though no sequence in the open set G = X − {x_0} can converge to x_0, yet x_0 ∈ Cl(G) since the intersection of any (uncountable) open neighborhood U of x_0 with G, being an uncountable set, is not empty.
(b) By Corollary 1 of Theorem A.1.5, the uncountable open set G = X − {x_0} is also closed in X because if any sequence (x_1, x_2, . . .) in G converges to some x ∈ X, then x must be in G as the sequence must be eventually constant in order for it to converge. But this is a contradiction as G cannot be closed since it is not countable.31 By the same reckoning, although {x_0} is not an open set because its complement is not countable, nevertheless it follows from Eq. (A.40) that should any sequence converge to the only point x_0 of this set, then it must eventually be in {x_0} so by Corollary 2 of the same theorem, {x_0} becomes an open set.
(c) The identity map 1 : X → X_d, where X_d is X with discrete topology, is not continuous because the inverse image of any singleton of X_d is not open in X. Yet if a sequence converges in X to x, then its image (1(x)) = (x) must actually converge to x in X_d because a sequence converges in a discrete space, as in the cofinite or cocountable spaces, iff it is eventually constant; this is so because each element of a discrete space being clopen is boundary-less.

31 Note that {x} is a 1-point set but (x) is an uncountable sequence.
This pathological behavior of sequences in a non-Hausdorff, non-first-countable space does not arise if the discrete indexing set of sequences is replaced by a continuous, uncountable directed set like ℝ for example, leading to nets in place of sequences. In this case the net can be in an open set without having to be constant valued in order to converge to a point in it as the open set can be defined as the complement of a closed countable part of the uncountable net. The careful reader could not have failed to notice that the burden of the above arguments, as also of that in the example following Theorem 4.6, is to formalize the fact that since a closed set is already defined as a countable (respectively finite) set, the closure operation cannot add further points to it from its complement, and any sequence that converges in an open set in these topologies must necessarily be eventually constant at its point of convergence, a restriction that no longer applies to a net. The cocountable topology thus has the very interesting property of filtering out a countable part from an uncountable set, as for example the rationals in ℝ.

This example serves to illustrate the hard truth that in a space that is not first countable, the simplicity of sequences is not enough to describe its topological character, and in fact “sequential convergence will be able to describe only those topologies in which the number of (basic) neighborhoods around each point is no greater than the number of terms in the sequences”, [Willard, 1970]. It is important to appreciate the significance of this interplay of convergence of sequences and nets (and of continuity of functions of Appendix A.1) and the topology of the underlying spaces.

A comparison of the defining properties (T1)–(T3) of topology 𝒯 with (F1)–(F3) of that of the filter ℱ, shows that a filter is very close to a topology with the main difference being with regard to the empty set which must always be in 𝒯 but never in ℱ. Addition of the empty set to a filter yields a topology, but removal of the empty set from a topology need not produce the corresponding filter as the topology may contain nonintersecting sets. The distinction between the topological and filter-bases should be carefully noted. Thus

(a) While the topological base may contain the empty set, a filter-base cannot.
(b) From a given topology, form a common base by dropping all basic open sets that do not intersect. Then a (coarser) topology can be generated from this base by taking all unions, and a filter by taking all supersets according to Eq. (A.30). For any given filter this expression may be used to extract a subclass _F B as a base for ℱ.

A.2. Initial and Final Topology

The commutative diagram of Fig. contains four sub-diagrams X − X_B − f(X), Y − X_B − f(X), X − X_B − Y and X − f(X) − Y. Of these, the first two are especially significant as they can be used to conveniently define the topologies on X_B and f(X) from those of X and Y, so that f_B, f_B⁻¹ and G have some desirable continuity properties; we recall that a function f : X → Y is continuous if inverse images of open sets of Y are open in X. This simple notion of continuity needs refinement in order that topologies on X_B and f(X) be unambiguously defined from those of X and Y, a requirement that leads to the concepts of the so-called final and initial topologies. To appreciate the significance of these new constructs, note that if f : (X, 𝒰) → (Y, 𝒱) is a continuous function, there may be open sets in X that are not inverse images of open (or for that matter of any) subsets of Y, just as it is possible for non-open subsets of Y to contribute to 𝒰. When the triple {𝒰, f, 𝒱} is tuned in such a manner that these are impossible, the topologies so generated on X and Y are the initial and final topologies respectively; they are the smallest (coarsest) and largest (finest) topologies on X and Y that make f : X → Y continuous. It should be clear that every image and preimage continuous function is continuous, but the converse is not true.

Let sat(U) := f⁻f(U) ⊆ X be the saturation of an open set U of X and comp(V) := f f⁻(V) = V ∩ f(X) ⊆ Y be the component of an open set V of Y on the range f(X) of f. Let 𝒰sat, 𝒱comp denote respectively the saturations 𝒰sat = {sat(U) : U ∈ 𝒰} of the open sets of X and the components 𝒱comp = {comp(V) : V ∈ 𝒱} of the open sets of Y whenever these are also open in X and Y respectively. Plainly, 𝒰sat ⊆ 𝒰 and 𝒱comp ⊆ 𝒱.

Definition A.2.1. For a function e : X → (Y, 𝒱), the preimage or initial topology of X based on (generated by) e and 𝒱 is

    IT{e; 𝒱} := {U ⊆ X : U = e⁻(V) if V ∈ 𝒱comp},                        (A.41)
while for q : (X, 𝒰) → Y, the image or final topology of Y based on (generated by) 𝒰 and q is

    FT{𝒰; q} := {V ⊆ Y : q⁻(V) = U if U ∈ 𝒰sat}.                         (A.42)

Thus, the topology of (X, IT{e; 𝒱}) consists of, and only of, the e-saturations of all the open sets of e(X), while the open sets of (Y, FT{𝒰; q}) are the q-images in Y (and not just in q(X)) of all the q-saturated open sets of X.32 The need for defining (A.41) in terms of 𝒱comp rather than 𝒱 will become clear in the following. The subspace topology IT{i; 𝒰} of a subset A ⊆ (X, 𝒰) is a basic example of the initial topology by the inclusion map i : X ⊇ A → (X, 𝒰), and we take its generalization e : (A, IT{e; 𝒱}) → (Y, 𝒱) that embeds a subset A of X into Y as the prototype of a preimage continuous map. Clearly the topology of Y may also contain open sets not in e(X), and any subset in Y − e(X) may be added to the topology of Y without altering the preimage topology of X: open sets of Y not in e(X) may be neglected in obtaining the preimage topology as e⁻(Y − e(X)) = ∅. The final topology on a quotient set by the quotient map Q : (X, 𝒰) → X/∼, which is just the collection of Q-images of the Q-saturated open sets of X, known as the quotient topology of X/∼, is the basic example of the image topology and the resulting space (X/∼, FT{𝒰; Q}) is called the quotient space. We take the generalization q : (X, 𝒰) → (Y, FT{𝒰; q}) of Q as the prototype of an image continuous function.

32 We adopt the convention of denoting arbitrary preimage and image continuous functions by e and q respectively even though they are not injective or surjective; recall that the embedding e : X ⊇ A → Y and the association q : X → f(X) are 1 : 1 and onto respectively.

The following results are specifically useful in dealing with initial and final topologies; compare the corresponding results for open maps given later.

Theorem A.2.1. Let (X, 𝒰) and (Y1, 𝒱1) be topological spaces and let X1 be a set. If f : X1 → (Y1, 𝒱1), q : (X, 𝒰) → X1, and h = f ∘ q : (X, 𝒰) → (Y1, 𝒱1) are functions with the topology 𝒰1 of X1 given by FT{𝒰; q}, then

(a) f is continuous iff h is continuous,
(b) f is image continuous iff 𝒱1 = FT{𝒰; h}.

Theorem A.2.2. Let (Y, 𝒱) and (X1, 𝒰1) be topological spaces and let Y1 be a set. If f : (X1, 𝒰1) → Y1, e : Y1 → (Y, 𝒱) and g = e ∘ f : (X1, 𝒰1) → (Y, 𝒱) are functions with the topology 𝒱1 of Y1 given by IT{e; 𝒱}, then

(a) f is continuous iff g is continuous,
(b) f is preimage continuous iff 𝒰1 = IT{g; 𝒱}.

As we need the second part of these theorems in our applications, their proofs are indicated below. The special significance of the first parts is that they ensure the converse of the usual result that the composition of two continuous functions is continuous, namely that one of the components of a composition is continuous whenever the composition is so.

Proof of Theorem A.2.1. If f be image continuous, 𝒱1 = {V1 ⊆ Y1 : f⁻(V1) ∈ 𝒰1} and 𝒰1 = {U1 ⊆ X1 : q⁻(U1) ∈ 𝒰} are the final topologies of Y1 and X1 based on the topologies of X1 and X, respectively. Then 𝒱1 = {V1 ⊆ Y1 : q⁻f⁻(V1) ∈ 𝒰} shows that h is image continuous.

Conversely, when h is image continuous, 𝒱1 = {V1 ⊆ Y1 : h⁻(V1) ∈ 𝒰} = {V1 ⊆ Y1 : q⁻f⁻(V1) ∈ 𝒰}, with 𝒰1 = {U1 ⊆ X1 : q⁻(U1) ∈ 𝒰}, proves f⁻(V1) to be open in X1 and thereby f to be image continuous.

Proof of Theorem A.2.2. If f be preimage continuous, 𝒱1 = {V1 ⊆ Y1 : V1 = e⁻(V) if V ∈ 𝒱} and 𝒰1 = {U1 ⊆ X1 : U1 = f⁻(V1) if V1 ∈ 𝒱1} are the initial topologies of Y1 and X1 respectively. Hence from 𝒰1 = {U1 ⊆ X1 : U1 = f⁻e⁻(V) if V ∈ 𝒱} it follows that g is preimage continuous.

Conversely, when g is preimage continuous, 𝒰1 = {U1 ⊆ X1 : U1 = g⁻(V) if V ∈ 𝒱} = {U1 ⊆ X1 : U1 = f⁻e⁻(V) if V ∈ 𝒱} and 𝒱1 = {V1 ⊆ Y1 : V1 = e⁻(V) if V ∈ 𝒱} show that f is preimage continuous.

Since both Eqs. (A.41) and (A.42) are in terms of inverse images (the first of which constitutes a direct, and the second an inverse, problem) the image f(U) = comp(V) for V ∈ 𝒱 is of interest as it indicates the relationship of the openness of f with its continuity. This, and other related concepts are examined below, where the range space f(X) is always taken to be a subspace of Y. Openness of a function f : (X, 𝒰) → (Y, 𝒱) is the “inverse” of continuity, when images of open sets of X are required to be open in Y; such a function is said to be open. Following are two of the important properties of open functions.
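Before turning to open functions, the constructions (A.41) and (A.42) can be made concrete on finite sets, where every subset can be enumerated. The following is only a minimal sketch under stated assumptions: maps are represented as Python dictionaries, topologies as sets of frozensets, and the function names (`preimage`, `initial_topology`, `final_topology`) as well as the example data `X`, `Y`, `q`, `U_top`, `V_top` are hypothetical illustrations, not notation from the paper.

```python
from itertools import chain, combinations


def powerset(points):
    """Return all subsets of a finite set, each as a frozenset."""
    pts = list(points)
    subsets = chain.from_iterable(combinations(pts, r) for r in range(len(pts) + 1))
    return [frozenset(s) for s in subsets]


def preimage(f, V):
    """Inverse image f^-(V) = {x : f(x) in V} of a map f given as a dict."""
    return frozenset(x for x, y in f.items() if y in V)


def initial_topology(e, V_top):
    """Initial (preimage) topology: the e-preimages of the open sets of Y."""
    return {preimage(e, V) for V in V_top}


def final_topology(U_top, q, Y):
    """Final (image) topology: every subset of Y whose q-preimage is open in X."""
    return {V for V in powerset(Y) if preimage(q, V) in U_top}


# Hypothetical example data (an assumption made for illustration only).
X = frozenset({1, 2, 3, 4})
Y = frozenset({"a", "b", "c"})
q = {1: "a", 2: "a", 3: "b", 4: "b"}            # "c" lies outside the range q(X)
U_top = {frozenset(), frozenset({1, 2}), X}      # a topology on X
V_top = {frozenset(), frozenset({"a"}), frozenset({"a", "c"}), Y}  # a topology on Y

FT = final_topology(U_top, q, Y)   # finest topology on Y making q continuous
IT = initial_topology(q, V_top)    # coarsest topology on X making q continuous
```

In this sketch the singleton {"c"} belongs to `FT` because its preimage is the empty set, which is always open, while {"b"} does not because its preimage {3, 4} is not open in X. The initial topology generated from `V_top` coincides here with `U_top`.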
(1) If f : (X, 𝒰) → (Y, f(𝒰)) is an open function, then so is its restriction f : (X, 𝒰) → (f(X), IT{i; f(𝒰)}) to its range. The converse is true if f(X) is an open set of Y; thus openness of f : (X, 𝒰) → (f(X), f(𝒰)) implies that of f : (X, 𝒰) → (Y, 𝒱) whenever f(X) is open in Y such that f(U) ∈ 𝒱 for U ∈ 𝒰. The truth of this last assertion follows easily from the fact that if f(U) is an open set of f(X) ⊂ Y, then necessarily f(U) = V ∩ f(X) for some V ∈ 𝒱, and the intersection of two open sets of Y is again an open set of Y.
(2) If f : (X, 𝒰) → (Y, 𝒱) and g : (Y, 𝒱) → (Z, 𝒲) are open functions then g ∘ f : (X, 𝒰) → (Z, 𝒲) is also open. It follows that the condition in (1) on f(X) can be replaced by the requirement that the inclusion i : (f(X), IT{i; 𝒱}) → (Y, 𝒱) be an open map. This interchange of f(X) with its inclusion i : f(X) → Y into Y is a basic result that finds application in many situations.

Collected below are some useful properties of the initial and final topologies that we need in this work.

Initial Topology. In Fig. 21(b), consider Y1 = h(X1), e → i and f → h : X1 → (h(X1), IT{i; 𝒱}). From h⁻(B) = h⁻(B ∩ h(X1)) for any B ⊆ Y, it follows that for an open set V of Y, h⁻(Vcomp) = h⁻(V) is an open set of X1 and, if the topology of X1 is IT{h; 𝒱}, these are the only open sets of X1. Because Vcomp is an open set of h(X1) in its subspace topology, this implies that the preimage topologies IT{h; 𝒱} and IT{h; IT{i; 𝒱}} of X1 generated by h : X1 → (Y, 𝒱) and by its restriction h : X1 → (h(X1), IT{i; 𝒱}) to the range are the same. Thus the preimage topology of X1 is not affected if Y is replaced by the subspace h(X1), the part Y − h(X1) contributing nothing to IT{h; 𝒱}.

A preimage continuous function e : X → (Y, 𝒱) is not necessarily an open function. Indeed, if U = e⁻(V) ∈ IT{e; 𝒱}, it is almost trivial to verify along the lines of the restriction of open maps to its range, that e(U) = ee⁻(V) = e(X) ∩ V, V ∈ 𝒱, is open in Y (implying that e is an open map) iff e(X) is an open subset of Y (because finite intersections of open sets are open). A special case of this is the important consequence that the restriction e : (X, IT{e; 𝒱}) → (e(X), IT{i; 𝒱}) of e : (X, IT{e; 𝒱}) → (Y, 𝒱) to its range is an open map. Even though a preimage continuous map need not be open, it is true that an injective, continuous and open map f : X → (Y, 𝒱) is preimage continuous. Indeed, from its injectivity and continuity, inverse images of all open subsets of Y are saturated-open in X, and openness of f ensures that these are the only open sets of X, the condition of injectivity being required to exclude non-saturated sets from the preimage topology. It is therefore possible to rewrite Eq. (A.41) as

    U ∈ IT{e; 𝒱} ⇔ e(U) = V if V ∈ 𝒱comp,                                (A.43)

and to compare it with the following criterion for an injective, open-continuous map f : (X, 𝒰) → (Y, 𝒱) that necessarily satisfies sat(A) = A for all A ⊆ X

    U ∈ 𝒰 ⇔ ({f(U)}_{U∈𝒰} = 𝒱comp) ∧ (f⁻¹(V)|_{V∈𝒱} ∈ 𝒰).                (A.44)

Final Topology. Since it is necessarily produced on the range R(q) of q, the final topology is often considered in terms of a surjection. This however is not necessary as, much in the spirit of the initial topology, Y − q(X) ≠ ∅ inherits the discrete topology without altering anything, thereby allowing condition (A.42) to be restated in the following more transparent form

    V ∈ FT{𝒰; q} ⇔ V = q(U) if U ∈ 𝒰sat,                                 (A.45)

and to compare it with the following criterion for a surjective, open-continuous map f : (X, 𝒰) → (Y, 𝒱) that necessarily satisfies f f⁻(B) = B for all B ⊆ Y

    V ∈ 𝒱 ⇔ (𝒰sat = {f⁻(V)}_{V∈𝒱}) ∧ (f(U)|_{U∈𝒰} ∈ 𝒱).                  (A.46)

As may be anticipated from Fig. 21, the final topology does not behave as well for subspaces as the initial topology does. This is so because in Fig. 21(a) the two image continuous functions h and q are connected by a preimage continuous inclusion f, whereas in Fig. 21(b) all the three functions are preimage continuous. Thus quite like open functions, although image continuity of h : (X, 𝒰) → (Y1, FT{𝒰; h}) implies that of h : (X, 𝒰) → (h(X), IT{i; FT{𝒰; h}}) for a subspace h(X) of Y1, the converse need not be true unless, entirely like open functions again, either h(X) is an open set of Y1 or i : (h(X), IT{i; FT{𝒰; h}}) → (Y1, FT{𝒰; h}) is an open map. Since an open
preimage continuous map is image continuous, this makes i : h(X) → Y1 an ininal function and hence all the three legs of the commutative diagram image continuous.

[Fig. 21. Continuity in final and initial topologies. Panels (a) and (b): commutative diagrams.]

Like preimage continuity, an image continuous function q : (X, 𝒰) → Y need not be open. However, although the restriction of an image continuous function to the saturated open sets of its domain is an open function, q is unrestrictedly open iff the saturation of every open set of X is also open in X. In fact it can be verified without much effort that a continuous, open surjection is image continuous.

Combining Eqs. (A.43) and (A.45) gives the following criterion for ininality

    𝒰 and 𝒱 ∈ IFT{𝒰sat; f; 𝒱}
        ⇔ ({f(U)}_{U∈𝒰sat} = 𝒱)(𝒰sat = {f⁻(V)}_{V∈𝒱}),                   (A.47)

which reduces to the following for a homeomorphism f that satisfies both sat(A) = A for A ⊆ X and f f⁻(B) = B for B ⊆ Y

    𝒰 and 𝒱 ∈ HOM{𝒰; f; 𝒱}
        ⇔ (𝒰 = {f⁻¹(V)}_{V∈𝒱})({f(U)}_{U∈𝒰} = 𝒱)                         (A.48)

and compares with

    𝒰 and 𝒱 ∈ OC{𝒰; f; 𝒱}
        ⇔ (sat(U) ∈ 𝒰 : {f(U)}_{U∈𝒰} = 𝒱comp)
          ∧ (comp(V) ∈ 𝒱 : {f⁻(V)}_{V∈𝒱} = 𝒰sat)                          (A.49)

for an open-continuous f.

The following is a slightly more general form of the restriction on the inclusion that is needed for image continuity to behave well for subspaces of Y.

Theorem A.2.3. Let q : (X, 𝒰) → (Y, FT{𝒰; q}) be an image continuous function. For a subspace B of (Y, FT{𝒰; q}),

    FT{IT{j; 𝒰}; q} = IT{i; FT{𝒰; q}}

where q : (q⁻(B), IT{j; 𝒰}) → (B, FT{IT{j; 𝒰}; q}), if either q is an open map or B is an open set of Y.

In summary we have the useful result that an open preimage continuous function is image continuous and an open image continuous function is preimage continuous, where the second assertion follows on neglecting non-saturated open sets in X; this is permitted in as far as the generation of the final topology is concerned, as these sets produce the same images as their saturations. Hence an image continuous function q : X → Y is preimage continuous iff every open set in X is saturated with respect to q, and a preimage continuous function e : X → Y is image continuous iff the e-image of every open set of X is open in Y.

A.3. More on Topological Spaces

This Appendix, which completes the review of those concepts of topological spaces begun in Tutorial 4 that are needed for a proper understanding of this work, begins with the following summary of the different possibilities in the distribution of Der(A) and Bdy(A) between sets A ⊆ X and its complement X − A, and follows it up with a
few other important topological concepts that have been used, explicitly or otherwise, in this paper.

Definition A.3.1 (Separation, Connected Space). A separation (disconnection) of X is a pair of mutually disjoint nonempty open (and therefore closed) subsets H1 and H2 such that X = H1 ∪ H2. A space X is said to be connected if it has no separation, that is, if it cannot be partitioned into two open or two closed non-empty subsets. X is separated (disconnected) if it is not connected.

It follows from the definition, that for a disconnected space X the following are equivalent statements.

(a) There exist a pair of disjoint non-empty open subsets of X that cover X,
(b) There exist a pair of disjoint non-empty closed subsets of X that cover X,
(c) There exist a pair of disjoint non-empty clopen subsets of X that cover X,
(d) There exists a non-empty, proper, clopen subset of X.

By a connected subset is meant a subset of X that is connected when provided with its relative topology making it a subspace of X. Thus any connected subset of a topological space must necessarily be contained in any clopen set that might intersect it: if C and H are respectively connected and clopen subsets of X such that C ∩ H ≠ ∅, then C ⊂ H because C ∩ H is a non-empty clopen set in C which must contain C because C is connected.

For testing whether a subset of a topological space is connected, the following relativized form of (a)–(d) is often useful.

Lemma A.3.1. A subset A of X is disconnected iff there are disjoint open sets U and V of X satisfying

    U ∩ A ≠ ∅ ≠ V ∩ A such that A ⊆ U ∪ V, with U ∩ V ∩ A = ∅            (A.50)

or there are disjoint closed sets E and F of X satisfying

    E ∩ A ≠ ∅ ≠ F ∩ A such that A ⊆ E ∪ F, with E ∩ F ∩ A = ∅.           (A.51)

Thus A is disconnected iff there are disjoint clopen subsets in the relative topology of A that cover A.

Lemma A.3.2. If A is a subspace of X, a separation of A is a pair of disjoint nonempty subsets H1 and H2 of A whose union is A neither of which contains a cluster point of the other. A is connected iff there is no separation of A.

Proof. Let H1 and H2 be a separation of A so that they are clopen subsets of A whose union is A. As H1 is a closed subset of A it follows that H1 = Cl_X(H1) ∩ A, where Cl_X(H1) ∩ A is the closure of H1 in A; hence Cl_X(H1) ∩ H2 = ∅. But as the closure of a subset is the union of the set and its adherents, an empty intersection signifies that H2 cannot contain any of the cluster points of H1. A similar argument shows that H1 does not contain any adherent of H2.

Conversely suppose that neither H1 nor H2 contains an adherent of the other: Cl_X(H1) ∩ H2 = ∅ and Cl_X(H2) ∩ H1 = ∅. Hence Cl_X(H1) ∩ A = H1 and Cl_X(H2) ∩ A = H2 so that both H1 and H2 are closed in A. But since H1 = A − H2 and H2 = A − H1, they must also be open in the relative topology of A.

Following are some useful properties of connected spaces.

(c1) The closure of any connected subspace of a space is connected. In general, every B satisfying

         A ⊆ B ⊆ Cl(A)

     is connected. Thus any subset of X formed from A by adjoining to it some or all of its adherents is connected so that a topological space with a dense connected subset is connected.
(c2) The union of any class of connected subspaces of X with nonempty intersection is a connected subspace of X.
(c3) A topological space is connected iff there is a covering of the space consisting of connected sets with nonempty intersection. Connectedness is a topological property: Any space homeomorphic to a connected space is itself connected.
(c4) If H1 and H2 is a separation of X and A is any connected subset of X, then either A ⊆ H1 or A ⊆ H2.

While the real line ℝ is connected, a subspace of ℝ is connected iff it is an interval in ℝ.
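The clopen-set criterion (d) and the relative topology used in the lemmas above can be checked mechanically when X is finite. The following is a minimal sketch under stated assumptions: a topology is represented as a set of frozensets, and the names `clopen_sets`, `is_connected`, `subspace_topology` together with the example spaces `T_split` and `T_chain` are hypothetical, chosen only for illustration.

```python
def clopen_sets(X, topology):
    """Open sets of the space whose complement in X is also open."""
    X = frozenset(X)
    return {U for U in topology if X - U in topology}


def is_connected(X, topology):
    """Connected iff no non-empty, proper, clopen subset exists (criterion (d))."""
    X = frozenset(X)
    return all(U in (frozenset(), X) for U in clopen_sets(X, topology))


def subspace_topology(A, topology):
    """Relative topology on A: traces on A of the open sets of the space."""
    A = frozenset(A)
    return {U & A for U in topology}


# Hypothetical example spaces on X = {1, 2, 3} (assumptions for illustration).
X = frozenset({1, 2, 3})
T_split = {frozenset(), frozenset({1}), frozenset({2, 3}), X}   # {1} is clopen
T_chain = {frozenset(), frozenset({1}), frozenset({1, 2}), X}   # nested open sets

split_connected = is_connected(X, T_split)
chain_connected = is_connected(X, T_chain)

# Connectedness of subsets is tested in their relative topology.
A = frozenset({2, 3})
B = frozenset({1, 2})
A_connected_in_split = is_connected(A, subspace_topology(A, T_split))
B_connected_in_split = is_connected(B, subspace_topology(B, T_split))
```

Here `T_split` admits the separation H1 = {1}, H2 = {2, 3}, so the space is disconnected, and the connected subset A = {2, 3} lies entirely inside one piece of the separation, in line with property (c4). The subset B = {1, 2} meets both pieces and is disconnected in its relative topology.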
[Fig. 22. Classification of a subset A of X relative to the topology of X. A 3 × 3 array of diagrams with rows for A and columns for X − A, each labelled 1. Donor, 2. Selfish (Closed), 3. Neutral; the legend marks Der(A), Bdy_{X−A}(A), Der(X − A) and Bdy_A(X − A).] The derived set of A may intersect both A and X − A (row 3), may be entirely in A (row 2), or may be wholly in X − A (row 1). A is closed iff Bdy(A) ⊆ A (row 2), open iff Bdy(A) ⊆ X − A (column 2), and clopen iff Bdy(A) = ∅ when the derived sets of both A and X − A are contained in the respective sets. An open set, besides being closed, may also be neutral or donor.

The important concept of total disconnectedness introduced below needs the following

Definition A.3.2 (Component). A component C∗ of a space X is a maximally (with respect to inclusion) connected subset of X.

Thus a component is a connected subspace which is not properly contained in any larger connected subspace of X. The maximal element need not be unique as there can be more than one component of a given space and a “maximal” criterion rather than “maximum” is used as the component need not contain every connected subset of X; it simply must not be contained in any other connected subset of X. Components can be constructively defined as follows: Let x ∈ X be any point. Consider the collection of all connected subsets of X to which x belongs. Since {x} is one such set, the collection is non-empty. As the intersection of the collection is non-empty, its union is a non-empty connected set C. This is the largest connected set containing x and is therefore a component containing x and we have

(C1) Let x ∈ X. The unique component of X containing x is the union of all the connected subsets of X that contain x. Conversely any non-empty connected subset A of X is contained
in that unique component of X to which each of the points of A belong. Hence a topological space is connected iff it is the unique component of itself.
(C2) Each component C∗ of X is a closed set of X: By property (c1) above, Cl(C∗) is also connected and from C∗ ⊆ Cl(C∗) it follows that C∗ = Cl(C∗). Components need not be open sets of X: an example of this is the space of rationals ℚ in reals in which the components are the individual points which cannot be open in ℝ; see Example 2 below.
(C3) Components of X are equivalence classes of (X, ∼) with x ∼ y iff they are in the same component: while reflexivity and symmetry are obvious enough, transitivity follows because if x, y ∈ C1 and y, z ∈ C2 with C1, C2 connected subsets of X, then x and z are in the set C1 ∪ C2 which is connected by property (c2) above as they have the point y in common. Components are connected disjoint subsets of X whose union is X (i.e. they form a partition of X with each point of X contained in exactly one component of X) such that any connected subset of X can be contained in only one of them. Because a connected subspace cannot contain in it any clopen subset of X, it follows that every clopen connected subspace must be a component of X.

Even when a space is disconnected, it is always possible to decompose it into pairwise disjoint connected subsets. If X is a discrete space this is the only way in which X may be decomposed into connected pieces. If X is not discrete, there may be other ways of doing this. For example, the space

    X = {x ∈ ℝ : (0 ≤ x ≤ 1) ∨ (2 < x < 3)}

has the following three distinct decompositions into connected subsets:

    X = [0, 1/2) ∪ [1/2, 1] ∪ (2, 7/3] ∪ (7/3, 3)
    X = {0} ∪ ( ⋃_{n=1}^{∞} (1/(n+1), 1/n] ) ∪ (2, 3)
    X = [0, 1] ∪ (2, 3).

Intuition tells us that only in the third of these decompositions have we really broken up X into its connected pieces. What distinguishes the third from the other two is that neither of the pieces [0, 1] or (2, 3) can be enlarged into bigger connected subsets of X.

Table 7. Separation properties of some useful spaces.

    Space                      T0    T1    T2
    Discrete                   ✓     ✓     ✓
    Indiscrete                 ×     ×     ×
    ℝ, standard                ✓     ✓     ✓
    left/right ray             ✓     ×     ×
    Infinite cofinite          ✓     ✓     ×
    Uncountable cocountable    ✓     ✓     ×
    x-inclusion/exclusion      ✓     ×     ×
    A-inclusion/exclusion      ×     ×     ×

As connected spaces, the empty set and the singleton are considered to be degenerate and any connected subspace with more than one point is non-degenerate. At the opposite extreme of the largest possible component of a space X which is X itself, are the singletons {x} for every x ∈ X. This leads to the extremely important notion of a totally disconnected space.

Definition A.3.3 (Totally disconnected space). A space X is totally disconnected if every pair of distinct points in it can be separated by a disconnection of X.

X is totally disconnected iff the components in X are single points with the only nonempty connected subsets of X being the one-point sets: If x ≠ y ∈ A ⊆ X are distinct points of a subset A of X then A = (A ∩ H1) ∪ (A ∩ H2), where X = H1 ∪ H2 with x ∈ H1 and y ∈ H2 is a disconnection of X (it is possible to choose H1 and H2 in this manner because X is assumed to be totally disconnected), is a separation of A that demonstrates that any subspace of a totally disconnected space with more than one point is disconnected.

A totally disconnected space has interesting physically appealing separation properties in terms of the (separated) Hausdorff spaces; here a topological space X is Hausdorff, or T2, iff each two distinct points of X can be separated by disjoint neighborhoods, so that for every x ≠ y ∈ X, there are neighborhoods M ∈ 𝒩x and N ∈ 𝒩y such that M ∩ N = ∅. This means that for any two distinct points x ≠ y ∈ X, it is impossible to find points that are arbitrarily close to both of them. Among the
properties of Hausdorff spaces, the following need to be mentioned.

(H1) X is Hausdorff iff for each x ∈ X and any point y ≠ x, there is a neighborhood N of x such that y ∉ Cl(N). This leads to the significant result that for any x ∈ X the closed singleton

\[ \{x\} = \bigcap_{N \in \mathcal{N}_x} \mathrm{Cl}(N) \]

is the intersection of the closures of any local base at that point, which in the language of nets and filters (Appendix A.1) means that a net in a Hausdorff space cannot converge to more than one point in the space and the adherent set adh(𝒩_x) of the neighborhood filter at x is the singleton {x}.

(H2) Since each singleton is a closed set, each finite set in a Hausdorff space is also closed in X. Unlike a cofinite space, however, there can clearly be infinite closed sets in a Hausdorff space.

(H3) Any point x in a Hausdorff space X is a cluster point of A ⊆ X iff every neighborhood of x contains infinitely many points of A, a fact that has led to our mental conditioning of the points of a (Cauchy) sequence piling up in neighborhoods of the limit. Thus suppose for the sake of argument that although some neighborhood of x contains only a finite number of points of A, x is nonetheless a cluster point of A. Then there is an open neighborhood U of x such that U ∩ (A − {x}) = {x1, . . . , xn} is a finite closed set of X not containing x, and U ∩ (X − {x1, . . . , xn}), being the intersection of two open sets, is an open neighborhood of x not intersecting A − {x}, implying thereby that x ∉ Der(A); in fact U ∩ (X − {x1, . . . , xn}) is simply {x} if x ∈ A or belongs to Bdy_{X−A}(A) when x ∈ X − A. Conversely if every neighborhood of a point of X intersects A in infinitely many points, that point must belong to Der(A) by definition.

Weaker separation axioms than Hausdorffness are those of T0, respectively T1, spaces in which for every pair of distinct points at least one, respectively each one, has some neighborhood not containing the other; Table 7 is a listing of the separation properties of some useful spaces.

It should be noted that as none of the properties (H1)–(H3) need neighborhoods of both points simultaneously, it is sufficient for X to be T1 for the conclusions to remain valid.

From its definition it follows that any totally disconnected space is a Hausdorff space and is therefore both a T1 and a T0 space as well. However, if a Hausdorff space has a base of clopen sets then it is totally disconnected; this is so because if x and y are distinct points of X, then the assumed property of x ∈ H ⊆ M for every M ∈ 𝒩_x and some clopen set H yields X = H ∪ (X − H) as a disconnection of X that separates x and y ∈ X − H; note that the assumed Hausdorffness of X allows M to be chosen so as not to contain y.

Example A.3.1

(1) Every indiscrete space is connected; every subset of an indiscrete space is connected. Hence if X is empty or a singleton, it is connected. A discrete space is connected iff it is either empty or is a singleton; the only connected subsets in a discrete space are the degenerate ones. This is an extreme case of lack of connectedness, and a discrete space is the simplest example of a totally disconnected space.

(2) Q, the set of rationals considered as a subspace of the real line, is (totally) disconnected because the set of all rationals larger than a given irrational r is a clopen set in Q, and

\[ \mathbb{Q} = \big((-\infty, r) \cap \mathbb{Q}\big) \cup \big(\mathbb{Q} \cap (r, \infty)\big), \qquad r \text{ is an irrational,} \]

is the union of two disjoint clopen sets in the relative topology of Q. The sets (−∞, r) ∩ Q and Q ∩ (r, ∞) are clopen in Q because neither contains a cluster point of the other. Thus for example, any neighborhood of the second must contain the irrational r in order to be able to cut the first, which means that any neighborhood of a point in either of the relatively open sets cannot be wholly contained in the other. The only connected sets of Q are one point subsets consisting of the individual rationals. In fact, a connected piece of Q, being a connected subset of R, is an interval in R, and a nonempty interval cannot be contained in Q unless it is a singleton. It needs to be noted that the individual points of the rational line are not (cl)open because any open subset of R that contains a rational must also contain others different from
it. This example shows that a space need not be discrete for each of its points to be a component and thereby for the space to be totally disconnected.

In a similar fashion, the set of irrationals is (totally) disconnected because the set of all the irrationals larger than a given rational is an example of a clopen set in R − Q.

(3) The p-inclusion (A-inclusion) topology is connected; a subset in this topology is connected iff it is degenerate or contains p. For, a subset inherits the discrete topology if it does not contain p, and the p-inclusion topology if it contains p.

(4) The cofinite (cocountable) topology on an infinite (uncountable) space is connected; a subset in a cofinite (cocountable) space is connected iff it is degenerate or infinite (uncountable).

(5) Removal of a single point may render a connected space disconnected and even totally disconnected. In the former case, the point removed is called a cut point and in the second, it is a dispersion point. Any real number is a cut point of R, and R does not have any dispersion point.

(6) Let X be a topological space. Considering components of X as equivalence classes by the equivalence relation ∼ with Q : X → X/∼ denoting the quotient map, X/∼ is totally disconnected: As Q⁻([x]) is connected for each [x] ∈ X/∼ in a component class of X, and as any open or closed subset A ⊆ X/∼ is connected iff Q⁻(A) is open or closed, it must follow that A can only be a singleton.

The next notion of compactness in topological spaces provides an insight into the role of non-empty adherent sets of filters that lead in a natural fashion to the concept of attractors in the dynamical systems theory that we take up next.

Definition A.3.4 (Compactness). A topological space X is compact iff every open cover of X contains a finite subcover of X.

This definition of compactness has a useful equivalent contrapositive reformulation: For any given collection of open sets of X, if none of its finite subcollections cover X, then the entire collection also cannot cover X. The following theorem is a statement of the fundamental property of compact spaces in terms of adherences of filters in such spaces, the proof of which uses this contrapositive characterization of compactness.

Theorem A.2.1. A topological space X is compact iff each class of closed subsets of X with the finite intersection property has non-empty intersection.

Proof. Necessity. Let X be a compact space. Let F = {F_α}_{α∈D} be a collection of closed subsets of X with the finite intersection property (FIP), and let G = {X − F_α}_{α∈D} be the corresponding open sets of X. If {G_i}_{i=1}^N is a non-empty finite subcollection from G, then {X − G_i}_{i=1}^N is the corresponding non-empty finite subcollection of F. Hence from the assumed finite intersection property of F, it must be true that

\[ X - \bigcup_{i=1}^{N} G_i = \bigcap_{i=1}^{N} (X - G_i) \neq \emptyset \qquad \text{(De Morgan's Law)}, \]

so that no finite subcollection of G can cover X. Compactness of X now implies that G too cannot cover X and therefore

\[ \bigcap_{\alpha} F_\alpha = \bigcap_{\alpha} (X - G_\alpha) = X - \bigcup_{\alpha} G_\alpha \neq \emptyset. \]

The proof of the converse is a simple exercise of reversing the arguments involving the two equations in the proof above.

Our interest in this theorem and its proof lies in the following corollary, which essentially means that for every filter F on a compact space the adherent set adh(F) is not empty, and from which it follows that every net in a compact space must have a convergent subnet.

Corollary. A space X is compact iff for every class A = (A_α) of nonempty subsets of X with FIP,

\[ \mathrm{adh}(\mathcal{A}) = \bigcap_{A_\alpha \in \mathcal{A}} \mathrm{Cl}(A_\alpha) \neq \emptyset. \]

The proof of this result for nets given by the next theorem illustrates the general approach in such cases, which is all that is basically needed in dealing with attractors of dynamical systems; compare Theorem A.1.3.

Theorem A.3.2. A topological space X is compact iff each net in X adheres in X.

Proof. Necessity. Let X be a compact space, χ : D → X a net in X, and R_α the residual of α in the directed set D. For the filter-base (_F B_χ(R_α))_{α∈D} of nonempty, decreasing, nested subsets of X associated with the net χ, compactness of X requires from
\[ \mathrm{Cl}(\chi(R_\alpha)) \supseteq \chi(R_\delta) \neq \emptyset, \qquad \forall\, \alpha \preceq \delta, \]

that the uncountably intersecting subset

\[ \mathrm{adh}({}_{F}B_\chi) := \bigcap_{\alpha \in D} \mathrm{Cl}(\chi(R_\alpha)) \]

of X be non-empty. If x ∈ adh(_F B_χ) then because x is in the closure of χ(R_β), it follows from Eq. (20) that N ∩ χ(R_β) ≠ ∅ (see footnote 33) for every N ∈ 𝒩_x, β ∈ D. Hence χ(γ) ∈ N for some γ ⪰ β so that x ∈ adh(χ); see Eq. (A.16).

Sufficiency. Let χ be a net in X that adheres at x ∈ X. From any class F of closed subsets of X with FIP, construct as in the proof of Theorem A.1.4, a decreasing nested sequence of closed subsets C_β = ⋂_{α ⪯ β ∈ D} {F_α : F_α ∈ F} and consider the directed set _D C_β = {(C_β, β) : (β ∈ D)(x_β ∈ C_β)} with its natural direction (A.11) to define the net χ(C_β, β) = x_β in X; see Definition A.1.10. From the assumed adherence of χ at some x ∈ X, it follows that N ∩ F ≠ ∅ for every N ∈ 𝒩_x and F ∈ F. Hence x belongs to the closed set F so that x ∈ adh(F); see Eq. (A.24). Hence X is compact.

33 This is of course a triviality if we identify each χ(R_β) (or F in the proof of the converse that follows) with a neighborhood N of X that generates a topology on X.

Using Theorem A.1.5 that specifies a definite criterion for the adherence of a net, this theorem reduces to the useful formulation that a space is compact iff each net in it has some convergent subnet. An important application is the following: Since every decreasing sequence (F_m) of nonempty sets has FIP (because ⋂_{m=1}^M F_m = F_M for every finite M), every decreasing sequence of nonempty closed subsets of a compact space has nonempty intersection. For a complete metric space this is known as the Nested Set Theorem, and for [0, 1] and other compact subspaces of R as the Cantor Intersection Theorem (see footnote 34).

34 Nested-set theorem. If (E_n) is a decreasing sequence of nonempty, closed subsets of a complete metric space (X, d) such that lim_{n→∞} dia(E_n) = 0, then there is a unique point

\[ x \in \bigcap_{n=0}^{\infty} E_n. \]

The uniqueness arises because the limiting condition on the diameters of E_n implies, from property (H1), that (X, d) is a Hausdorff space.

For subspaces A of X, it is the relative topology that determines as usual compactness of A; however the following criterion renders this test in terms of the relative topology unnecessary and shows that the topology of X itself is sufficient to determine compactness of subspaces: A subspace K of a topological space X is compact iff each open cover of K in X contains a finite cover of K.

A proper understanding of the distinction between compactness and closedness of subspaces, which often causes much confusion to the non-specialist, is expressed in the next two theorems. As a motivation for the first, and to show that not every subset of a compact space need be compact, mention may be made of the subset (a, b) of the compact closed interval [a, b] in R.

Theorem A.3.3. A closed subset F of a compact space X is compact.

Proof. Let G be an open cover of F so that an open cover of X is G ∪ {X − F}, which because of compactness of X contains a finite subcover U. Then U − {X − F} is a finite subcollection of G that covers F.

It is not true in general that a compact subset of a space is necessarily closed. For example, in an infinite set X with the cofinite topology, let F be an infinite subset of X with X − F also infinite. Then although F is not closed in X, it is nevertheless compact because X is compact. Indeed, let G be an open cover of X and choose any non-empty G_0 ∈ G. If G_0 = X then {G_0} is the required finite cover of X. If this is not the case, then because X − G_0 = {x_i}_{i=1}^n is a finite set, there is a G_i ∈ G with x_i ∈ G_i for each 1 ≤ i ≤ n, and therefore {G_i}_{i=0}^n is the finite cover that demonstrates the compactness of the cofinite space X. Compactness of F now follows because the subspace topology on F is the induced cofinite topology from X. The distinguishing feature of this topology is that it, like the cocountable, is not Hausdorff: If U and V are any two nonempty open sets of X, then they cannot be disjoint as the complements of the open sets can only be finite and if U ∩ V were to be indeed empty, then

\[ X = X - \emptyset = X - (U \cap V) = (X - U) \cup (X - V) \]
would be a finite set. An immediate fallout of this is that in an infinite cofinite space, a sequence (x_i)_{i∈N} (and even a net) with x_i ≠ x_j for i ≠ j behaves in an extremely unusual way: It converges, as in the indiscrete space, to every point of the space. Indeed if x ∈ X, where X is an infinite set provided with its cofinite topology, and U is any neighborhood of x, any such sequence (x_i)_{i∈N} of distinct points in X must be eventually in U because X − U is finite, and ignoring the initial set of its values lying in X − U in no way alters the ultimate behavior of the sequence (note that this implies that the filter induced on X by the sequence agrees with its topology). Thus x_i → x for any x ∈ X is a reflection of the fact that there are no small neighborhoods of any point of X, with every neighborhood being almost the whole of X, except for a null set consisting of only a finite number of points. This is in sharp contrast with Hausdorff spaces where, although every finite set is also closed, every point has arbitrarily small neighborhoods that lead to unique limits of sequences. A corresponding result for cocountable spaces can be found in Example A.1.2, continued.

This example of the cofinite topology motivates the following "converse" of the previous theorem.

Theorem A.3.4. Every compact subspace of a Hausdorff space is closed.

Proof. Let K be a non-empty compact subset of X, Fig. 23, and let x ∈ X − K. Because of the separation of X, for every y ∈ K there are disjoint open subsets U_y and V_y of X with y ∈ U_y and x ∈ V_y. Hence {U_y}_{y∈K} is an open cover for K, and from its compactness there is a finite subset A of K such that K ⊆ ⋃_{y∈A} U_y with V = ⋂_{y∈A} V_y an open neighborhood of x; V is open because each V_y is a neighborhood of x and the intersection is over finitely many points y of A. To prove that K is closed in X it is enough to show that V is disjoint from K: If there is indeed some z ∈ V ∩ K then z must be in some U_y for y ∈ A. But as z ∈ V it is also in V_y, which is impossible as U_y and V_y are to be disjoint. This last part of the argument in fact shows that if K is a compact subspace of a Hausdorff space X and x ∉ K, then there are disjoint open sets U and V of X containing x and K.

[Figure 23]

Fig. 23. Closedness of compact subsets of a Hausdorff space.

The last two theorems may be combined to give the obviously important

Corollary. In a compact Hausdorff space, closedness and compactness of its subsets are equivalent concepts.

In the absence of Hausdorffness, it is not possible to conclude from the assumed compactness of the space that every point to which the net may converge actually belongs to the subspace.

Definition A.3.5. A subset D of a topological space (X, U) is dense in X if Cl(D) = X. Thus the closure of D is the largest open subset of X, and every neighborhood of any point of X contains a point of D not necessarily distinct from it; refer to the distinction between Eqs. (20) and (22).

Loosely, D is dense in X iff every point of X has points of D arbitrarily close to it. A self-dense (dense in itself) set is a set without any isolated points; hence A is self-dense iff A ⊆ Der(A). A closed self-dense set is called a perfect set so that a closed set A is perfect iff it has no isolated points. Accordingly

A is perfect ⇔ A = Der(A),

means that the closure of a set without any isolated points is a perfect set.

Theorem A.3.5. The following are equivalent statements.

(1) D is dense in X.
(2) If F is any closed set of X with D ⊆ F, then F = X; thus the only closed superset of D is X.
(3) Every nonempty (basic) open set of X cuts D; thus the only open set disjoint from D is the empty set ∅.
(4) The exterior of D is empty.
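The four statements of Theorem A.3.5 can be checked mechanically on a finite topological space. The following is only an illustrative sketch: the helper names `closure` and `dense_statements`, and the choice of the p-inclusion topology on a three-point set with p = 0, are assumptions made for the example and do not appear in the paper.

```python
def closure(A, X, opens):
    """Smallest closed set containing A: intersect all closed supersets of A."""
    result = set(X)
    for U in opens:
        F = X - U                      # closed sets are complements of open sets
        if A <= F:
            result &= F
    return frozenset(result)


def dense_statements(D, X, opens):
    """Truth values of statements (1)-(4) of Theorem A.3.5 for the subset D."""
    closed_sets = [X - U for U in opens]
    s1 = closure(D, X, opens) == X                       # (1) Cl(D) = X
    s2 = all(F == X for F in closed_sets if D <= F)      # (2) only closed superset is X
    s3 = all(bool(U & D) for U in opens if U)            # (3) every nonempty open set cuts D
    s4 = (X - closure(D, X, opens)) == frozenset()       # (4) Ext(D) is empty
    return (s1, s2, s3, s4)


# Hypothetical example: p-inclusion topology on X = {0, 1, 2} with p = 0,
# whose open sets are the empty set and every subset containing 0.
X = frozenset({0, 1, 2})
opens = [frozenset(s) for s in (set(), {0}, {0, 1}, {0, 2}, {0, 1, 2})]

print(dense_statements(frozenset({0}), X, opens))   # {0} is dense: all four hold
print(dense_statements(frozenset({1}), X, opens))   # {1} is not dense: all four fail
```

In this sketch the four truth values agree for every subset, which is exactly what the equivalence asserts; a finite space is used only because it makes the closure computable by enumeration.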
[Figure 24: three panels, (a) A is isolated, (b) A is nwd, (c) C is Cantor with C = Der(C).]

Fig. 24. Shows the distinction between isolated, nowhere dense and Cantor sets. Topologically, the Cantor set can be described as a perfect, nowhere dense, totally disconnected and compact subset of a space. (b) The closed nowhere dense set Cl(A) is the boundary of its open complement. Here downward and upward inclined hatching denote respectively Bdy_A(X − A) and Bdy_{X−A}(A).

Proof. (3) If U indeed is a non-empty open set of X with U ∩ D = ∅, then D ⊆ X − U ≠ X leads to the contradiction X = Cl(D) ⊆ Cl(X − U) = X − U ≠ X, which also incidentally proves (2). From (3) it follows that for any open set U of X, Cl(U) = Cl(U ∩ D) because if V is any open neighborhood of x ∈ Cl(U) then V ∩ U is a non-empty open set of X that must cut D so that V ∩ (U ∩ D) ≠ ∅ implies x ∈ Cl(U ∩ D). Finally, Cl(U ∩ D) ⊆ Cl(U) completes the proof.

Definition A.3.6. A set A ⊆ X is said to be nowhere dense in X if Int(Cl(A)) = ∅ and residual in X if Int(A) = ∅.

A is nowhere dense in X iff

Bdy(X − Cl(A)) = Bdy(Cl(A)) = Cl(A)

so that

Cl(X − Cl(A)) = (X − Cl(A)) ∪ Cl(A) = X

from which it follows that

A is nwd in X ⇔ X − Cl(A) is dense in X

and

A is residual in X ⇔ X − A is dense in X.

Thus A is nowhere dense iff Ext(A) := X − Cl(A) is dense in X, and in particular, a closed set is nowhere dense in X iff its complement is open dense in X, with open-denseness being complementarily dual to closed-nowhere denseness. The rationals in reals is an example of a set that is residual but not nowhere dense. The following are readily verifiable properties of subsets of X.

(1) A set A ⊆ X is nowhere dense in X iff it is contained in its own boundary, iff it is contained in the closure of the complement of its closure, that is A ⊆ Cl(X − Cl(A)). In particular a closed subset A is nowhere dense in X iff A = Bdy(A), that is iff it contains no open set.

(2) From M ⊆ N ⇒ Cl(M) ⊆ Cl(N) it follows, with M = X − Cl(A) and N = X − A, that a nowhere dense set is residual, but a residual set need not be nowhere dense unless it is also closed in X.

(3) Since Cl(Cl(A)) = Cl(A), Cl(A) is nowhere dense in X iff A is.

(4) For any A ⊆ X, both Bdy_A(X − A) := Cl(X − A) ∩ A and Bdy_{X−A}(A) := Cl(A) ∩ (X − A) are residual sets and, as Fig. 22 shows, Bdy_X(A) = Bdy_{X−A}(A) ∪ Bdy_A(X − A) is the union of these two residual sets. When A is closed (or open) in X its boundary, consisting of the only component Bdy_A(X − A) (or Bdy_{X−A}(A)) as shown by the second row (or column) of the figure, being a closed set of X is also nowhere dense in X; in fact a closed nowhere dense set is always the boundary of some open set. Otherwise, the boundary components of the two residual parts (as in the donor–donor, donor–neutral, neutral–donor and neutral–neutral cases) need not be individually closed in X (although their union is) and their union is a residual set that need not be nowhere dense in X: the union of two nowhere dense sets is nowhere dense but the union of a residual and a nowhere dense set is a residual set. One way in which a two-component boundary can be nowhere dense is by having Bdy_A(X − A) ⊇ Der(A) or Bdy_{X−A}(A) ⊇ Der(X − A), so that it is effectively in one piece rather than in two, as shown in Fig. 24(b).
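The two characterizations following Definition A.3.6 (A is nwd iff X − Cl(A) is dense, A is residual iff X − A is dense) and property (2) can be illustrated on a small finite space. This is a sketch under stated assumptions: the function names and the three-point topology {∅, {0, 1}, X} are hypothetical choices for the example, and "residual" is used in the sense of the definition above (empty interior).

```python
from itertools import combinations


def interior(A, opens):
    """Largest open set contained in A: union of all open subsets of A."""
    result = frozenset()
    for U in opens:
        if U <= A:
            result |= U
    return result


def closure(A, X, opens):
    """Cl(A) computed as the complement of Int(X - A)."""
    return X - interior(X - A, opens)


def is_dense(D, X, opens):
    return closure(D, X, opens) == X


def is_nowhere_dense(A, X, opens):
    return not interior(closure(A, X, opens), opens)   # Int(Cl(A)) is empty


def is_residual(A, opens):
    return not interior(A, opens)                      # Int(A) is empty


# Hypothetical finite space: X = {0, 1, 2} with open sets {}, {0, 1}, X.
X = frozenset({0, 1, 2})
opens = [frozenset(), frozenset({0, 1}), X]

subsets = [frozenset(c) for r in range(4) for c in combinations(sorted(X), r)]
for A in subsets:
    # A is nwd  <=>  X - Cl(A) is dense
    assert is_nowhere_dense(A, X, opens) == is_dense(X - closure(A, X, opens), X, opens)
    # A is residual  <=>  X - A is dense
    assert is_residual(A, opens) == is_dense(X - A, X, opens)
    # property (2): nowhere dense implies residual
    assert (not is_nowhere_dense(A, X, opens)) or is_residual(A, opens)

# {0} plays the role of the rationals here: residual but not nowhere dense.
print(is_residual(frozenset({0}), opens), is_nowhere_dense(frozenset({0}), X, opens))
```

The subset {0} has empty interior but closure X, so it mirrors the example of the rationals in the reals, while {2} is closed with empty interior and hence nowhere dense.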
Theorem A.3.6. A is nowhere dense in X iff each non-empty open set of X has a non-empty open subset disjoint from Cl(A).

Proof. If U is a non-empty open set of X, then U_0 = U ∩ Ext(A) ≠ ∅ as Ext(A) is dense in X; U_0 is the open subset that is disjoint from Cl(A). It clearly follows from this that each non-empty open set of X has a non-empty open subset disjoint from a nowhere dense set A.

What this result (which follows just from the definition of nowhere dense sets) actually means is that no point in Bdy_{X−A}(A) can be isolated in it.

Corollary. A is nowhere dense in X iff Cl(A) does not contain any non-empty open set of X iff any nonempty open set that contains A also contains its closure.

Example A.3.2. Each finite subset of R^n is nowhere dense in R^n; the set {1/n}_{n=1}^∞ is nowhere dense in R. The Cantor set C is nowhere dense in [0, 1] because every neighborhood of any point in C must contain, by its very construction, a point with 1 in its ternary representation. That the interior and the interior of the closure of a set are not necessarily the same is seen in the example of the rationals in reals: The set of rational numbers Q has empty interior because any neighborhood of a rational number contains irrational numbers (so also is the case for irrational numbers) and R = Int(Cl(Q)) ⊇ Int(Q) = ∅ justifies the notion of a nowhere dense set.

The following properties of C can be taken to define any subset of a topological space as a Cantor set; set-theoretically it should be clear from its classical middle-third construction that the Cantor set consists of all points of the closed interval [0, 1] whose infinite triadic (base 3) representation, expressed so as not to terminate with an infinite string of 1's, does not contain the digit 1. Accordingly, any end point of the infinite set of closed intervals whose intersection yields the Cantor set is represented by a repeating string of either 0 or 2, while a non end point has every other arbitrary collection of these two digits. Recalling that any number in [0, 1] is a rational iff its representation in any base is terminating or recurring (thus any decimal that neither repeats nor terminates but consists of all possible sequences of all possible digits represents an irrational number), it follows that both rationals and irrationals belong to the Cantor set.

[Figure 25: the nested sets C_0, C_1, C_2, C_3, C_4, . . . of the middle-third construction, with their end points marked, and the limiting set C.]

Fig. 25. Construction of the classical 1/3-Cantor set. The end points of C_3, for example, in increasing order are: |0, 1/27|; |2/27, 1/9|; |2/9, 7/27|; |8/27, 1/3|; |2/3, 19/27|; |20/27, 7/9|; |8/9, 25/27|; |26/27, 1|. C_i is the union of 2^i pairwise disjoint closed intervals each of length 3^{−i} and the non-empty infinite intersection C = ⋂_{i=0}^∞ C_i is the adherent Cantor set of the filter-base of closed sets {C_0, C_1, C_2, . . .}.

(C1) C is totally disconnected. If possible, let C have a component containing points a and b with a < b. Then [a, b] ⊆ C ⇒ [a, b] ⊆ C_i for all i. But this is impossible because we may choose i large enough to have 3^{−i} < b − a so that a and b must belong to two different members of the 2^i pairwise disjoint closed subintervals each of length 3^{−i} that constitute C_i. Hence

[a, b] is not a subset of every C_i ⇒ [a, b] is not a subset of C.

(C2) C is perfect so that for any x ∈ C every neighborhood of x must contain some other point of C. Supposing to the contrary that the singleton {x} is an open set of C, there must be an ε > 0 such that in the usual topology of R

\[ \{x\} = C \cap (x - \varepsilon, x + \varepsilon). \tag{A.52} \]

Choose a positive integer i large enough to satisfy 3^{−i} < ε. Since x is in every C_i, it must be in one of the 2^i pairwise disjoint closed intervals [a, b] ⊂ (x − ε, x + ε) each of length 3^{−i} whose union is C_i. As [a, b] is an interval, at least one of the end points of [a, b] is different from x, and since an end point belongs to C, C ∩ (x − ε, x + ε) must also contain this point thereby violating Eq. (A.52).
(C3) C is nowhere dense because each neighborhood of any point of C intersects Ext(C); see Theorem A.3.6.

(C4) C is compact because it is a closed subset contained in the compact subspace [0, 1] of R, see Theorem A.3.3. The compactness of [0, 1] follows from the Heine-Borel Theorem which states that any subset of the real line is compact iff it is both closed and bounded with respect to the Euclidean metric on R.

Compare (C1) and (C2) with the essentially similar arguments of Example A.3.1(2) for the subspace of rationals in R.

A.4. Neutron Transport Theory

This section introduces the reader to the basics of the linear neutron transport theory where graphical convergence approximations to the singular distributions, interpreted here as multifunctions, led to the study of this paper. The one-speed (that is mono-energetic) neutron transport equation in one dimension and plane geometry is

\[ \mu \frac{\partial \Phi(x, \mu)}{\partial x} + \Phi(x, \mu) = \frac{c}{2} \int_{-1}^{1} \Phi(x, \mu')\, d\mu', \qquad 0 < c < 1, \quad -1 \le \mu \le 1 \tag{A.53} \]

where x is a non-dimensional physical space variable that denotes the location of the neutron moving in a direction θ = cos⁻¹(µ), Φ(x, µ) is a neutron density distribution function such that Φ(x, µ)dxdµ is the expected number of neutrons in a distance dx about the point x moving at constant speed with their direction cosines of motion in dµ about µ, and c is a physical constant that will be taken to satisfy the restriction shown above. Case's method starts by assuming the solution to be of the form Φ_ν(x, µ) = e^{−x/ν} φ(µ, ν) with a normalization integral constraint of ∫_{−1}^{1} φ(µ, ν)dµ = 1 to lead to the simple equation

\[ (\nu - \mu)\,\phi(\mu, \nu) = \frac{c\nu}{2} \tag{A.54} \]

for the unknown function φ(ν, µ). Case then suggested, see [Case & Zweifel, 1967], the non-simple complete solution of this equation to be

\[ \phi(\mu, \nu) = \frac{c\nu}{2}\, \mathcal{P} \frac{1}{\nu - \mu} + \lambda(\nu)\,\delta(\nu - \mu), \tag{A.55} \]

where λ(ν) is the usual combination coefficient of the solutions of the homogeneous and non-homogeneous parts of a linear equation, P(·) is a principal value and δ(x) the Dirac delta, to lead to the full-range −1 ≤ µ ≤ 1 solution valid for −∞ < x < ∞

\[ \Phi(x, \mu) = a(\nu_0) e^{-x/\nu_0} \phi(\mu, \nu_0) + a(-\nu_0) e^{x/\nu_0} \phi(-\nu_0, \mu) + \int_{-1}^{1} a(\nu) e^{-x/\nu} \phi(\mu, \nu)\, d\nu \tag{A.56} \]

of the one-speed neutron transport equation (A.53). Here the real ν_0 and ν satisfy respectively the integral constraints

\[
\begin{aligned}
\frac{c\nu_0}{2} \ln \frac{\nu_0 + 1}{\nu_0 - 1} &= 1, \qquad |\nu_0| > 1 \\
\lambda(\nu) &= 1 - \frac{c\nu}{2} \ln \frac{1 + \nu}{1 - \nu}, \qquad \nu \in [-1, 1],
\end{aligned}
\]

with

\[ \phi(\mu, \nu_0) = \frac{c\nu_0}{2} \frac{1}{\nu_0 - \mu} \]

following from Eq. (A.55).

It can be shown [Case & Zweifel, 1967] that the eigenfunctions φ(ν, µ) satisfy the full-range orthogonality condition

\[ \int_{-1}^{1} \mu\, \phi(\nu, \mu)\, \phi(\nu', \mu)\, d\mu = N(\nu)\, \delta(\nu - \nu'), \]

where the odd normalization constants N are given by

\[
N(\pm\nu_0) = \int_{-1}^{1} \mu\, \phi^2(\pm\nu_0, \mu)\, d\mu
= \pm \frac{c\nu_0^3}{2} \left( \frac{c}{\nu_0^2 - 1} - \frac{1}{\nu_0^2} \right) \qquad \text{for } |\nu_0| > 1,
\]

and

\[ N(\nu) = \nu \left[ \lambda^2(\nu) + \left( \frac{\pi c \nu}{2} \right)^2 \right] \qquad \text{for } \nu \in [-1, 1]. \]

With a source of particles ψ(x_0, µ) located at x = x_0 in an infinite medium, Eq. (A.56) reduces to the boundary condition, with µ, ν ∈ [−1, 1],

\[ \psi(x_0, \mu) = a(\nu_0) e^{-x_0/\nu_0} \phi(\mu, \nu_0) + a(-\nu_0) e^{x_0/\nu_0} \phi(-\nu_0, \mu) + \int_{-1}^{1} a(\nu) e^{-x_0/\nu} \phi(\mu, \nu)\, d\nu \tag{A.57} \]
for the determination of the expansion coefficients a(±ν_0), {a(ν)}_{ν∈[−1,1]}. Use of the above orthogonality integrals then leads to the complete solution of the problem to be

\[ a(\nu) = \frac{e^{x_0/\nu}}{N(\nu)} \int_{-1}^{1} \mu\, \psi(x_0, \mu)\, \phi(\mu, \nu)\, d\mu, \qquad \nu = \pm\nu_0 \text{ or } \nu \in [-1, 1]. \]

For example, in the infinite-medium Green's function problem with x_0 = 0 and ψ(x_0, µ) = δ(µ − µ_0)/µ, the coefficients are a(±ν_0) = φ(µ_0, ±ν_0)/N(±ν_0) when ν = ±ν_0, and a(ν) = φ(µ_0, ν)/N(ν) for ν ∈ [−1, 1].

For a half-space 0 ≤ x < ∞, the obvious reduction of Eq. (A.56) to

\[ \Phi(x, \mu) = a(\nu_0) e^{-x/\nu_0} \phi(\mu, \nu_0) + \int_{0}^{1} a(\nu) e^{-x/\nu} \phi(\mu, \nu)\, d\nu \tag{A.58} \]

with boundary condition, µ, ν ∈ [0, 1],

\[ \psi(x_0, \mu) = a(\nu_0) e^{-x_0/\nu_0} \phi(\mu, \nu_0) + \int_{0}^{1} a(\nu) e^{-x_0/\nu} \phi(\mu, \nu)\, d\nu, \tag{A.59} \]

leads to an infinitely more difficult determination of the expansion coefficients due to the more involved nature of the orthogonality relations of the eigenfunctions in the half-interval [0, 1] that now read, for ν, ν′ ∈ [0, 1] [Case & Zweifel, 1967],

\[
\begin{aligned}
\int_{0}^{1} W(\mu)\, \phi(\mu, \nu')\, \phi(\mu, \nu)\, d\mu &= \frac{W(\nu) N(\nu)}{\nu}\, \delta(\nu - \nu') \\
\int_{0}^{1} W(\mu)\, \phi(\mu, \nu_0)\, \phi(\mu, \nu)\, d\mu &= 0 \\
\int_{0}^{1} W(\mu)\, \phi(\mu, -\nu_0)\, \phi(\mu, \nu)\, d\mu &= c \nu \nu_0\, X(-\nu_0)\, \phi(\nu, -\nu_0) \\
\int_{0}^{1} W(\mu)\, \phi(\mu, \pm\nu_0)\, \phi(\mu, \nu_0)\, d\mu &= \mp \left( \frac{c\nu_0}{2} \right)^2 X(\pm\nu_0) \\
\int_{0}^{1} W(\mu)\, \phi(\mu, \nu_0)\, \phi(\mu, -\nu)\, d\mu &= \frac{c^2 \nu \nu_0}{4}\, X(-\nu) \\
\int_{0}^{1} W(\mu)\, \phi(\mu, \nu')\, \phi(\mu, -\nu)\, d\mu &= \frac{c\nu'}{2} (\nu_0 + \nu)\, \phi(\nu', -\nu)\, X(-\nu)
\end{aligned}
\tag{A.60}
\]

where the half-range weight function W(µ) is defined as

\[ W(\mu) = \frac{c\mu}{2(1 - c)(\nu_0 + \mu) X(-\mu)} \tag{A.61} \]

in terms of the X-function

\[ X(-\mu) = \exp\left\{ -\frac{c}{2} \int_{0}^{1} \frac{\nu}{N(\nu)} \left[ 1 + \frac{c\nu^2}{1 - \nu^2} \right] \ln(\nu + \mu)\, d\nu \right\}, \qquad 0 \le \mu \le 1, \]

that is conveniently obtained from a numerical solution of the nonlinear integral equation

\[ \Omega(-\mu) = 1 - \frac{c\mu}{2(1 - c)} \int_{0}^{1} \frac{\nu_0^2 (1 - c) - \nu^2}{(\nu_0^2 - \nu^2)(\mu + \nu)\, \Omega(-\nu)}\, d\nu \tag{A.62} \]

to yield

\[ X(-\mu) = \frac{\Omega(-\mu)}{\mu + \nu_0 \sqrt{1 - c}}, \]

and X(±ν_0) satisfy

\[ X(\nu_0) X(-\nu_0) = \frac{\nu_0^2 (1 - c) - 1}{2(1 - c)\, \nu_0^2 (\nu_0^2 - 1)}. \]

The sign ∓ in the fourth relation of Eq. (A.60) is the one consistent with the coefficients a_A(ν_0) and a_B(ν_0) of Problems A and B below. Two other useful relations involving the W-function are given by ∫_0^1 W(µ)φ(µ, ν_0)dµ = cν_0/2 and ∫_0^1 W(µ)φ(µ, ν)dµ = cν/2.

The utility of these full- and half-range orthogonality relations lies in the fact that a suitable class of functions of the type that is involved here can always be expanded in its terms, see [Case & Zweifel, 1967]. An example of this for a full-range problem has been given above; we end this introduction to the generalized (traditionally known as singular in neutron transport theory) eigenfunction method with two examples of the application of the half-range orthogonality integrals to the half-space problems A and B of Sec. 5.

Problem A (The Milne Problem). In this case there is no incident flux of particles from outside the medium at x = 0, but for large x > 0 the neutron distribution inside the medium behaves like e^{x/ν_0} φ(−ν_0, µ). Hence the boundary condition
(A.59) at x = 0 reduces to

\[ -\phi(\mu, -\nu_0) = a_A(\nu_0)\, \phi(\mu, \nu_0) + \int_{0}^{1} a_A(\nu)\, \phi(\mu, \nu)\, d\nu, \qquad \mu \ge 0. \]

Use of the fourth and third equations of Eq. (A.60) and the explicit relation Eq. (A.61) for W(µ) gives respectively the coefficients

\[
\begin{aligned}
a_A(\nu_0) &= \frac{X(-\nu_0)}{X(\nu_0)} \\
a_A(\nu) &= -\frac{1}{N(\nu)}\, c(1 - c)\, \nu_0^2\, \nu\, X(-\nu_0)\, X(-\nu).
\end{aligned}
\tag{A.63}
\]

The extrapolated end point z_0 of Eq. (67) is related to a_A(ν_0) of the Milne problem by a_A(ν_0) = −exp(−2z_0/ν_0).

Problem B (The Constant Source Problem). Here the boundary condition at x = 0 is

\[ 1 = a_B(\nu_0)\, \phi(\mu, \nu_0) + \int_{0}^{1} a_B(\nu)\, \phi(\mu, \nu)\, d\nu, \qquad \mu \ge 0, \]

which leads, using the integral relations satisfied by W, to the expansion coefficients

\[
\begin{aligned}
a_B(\nu_0) &= -\frac{2}{c \nu_0 X(\nu_0)} \\
a_B(\nu) &= \frac{1}{N(\nu)}\, (1 - c)\, \nu\, (\nu_0 + \nu)\, X(-\nu),
\end{aligned}
\tag{A.64}
\]

where X(±ν_0) are related to Problem A as

\[
\begin{aligned}
X(\nu_0) &= \frac{1}{\nu_0} \sqrt{ \frac{\nu_0^2 (1 - c) - 1}{2\, a_A(\nu_0)\, (1 - c)(\nu_0^2 - 1)} } \\
X(-\nu_0) &= \frac{1}{\nu_0} \sqrt{ \frac{a_A(\nu_0) \left[ \nu_0^2 (1 - c) - 1 \right]}{2 (1 - c)(\nu_0^2 - 1)} }.
\end{aligned}
\]

This brief introduction to the singular eigenfunction method should convince the reader of the great difficulties associated with half-space, half-range methods in particle transport theory; note that the X-functions in the coefficients above must be obtained from numerically computed tables. In contrast, full-range methods are more direct due to the simplicity of the weight function µ, which suggests the full-range formulation of half-range problems presented in Sec. 5. Finally it should be mentioned that this singular eigenfunction method is based on the theory of singular integral equations.
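
As a small numerical aside on the full-range quantities used in this section, the discrete eigenvalue ν_0 > 1 of the constraint (cν_0/2) ln[(ν_0 + 1)/(ν_0 − 1)] = 1 can be located by bisection, and the normalization ∫_{−1}^{1} φ(µ, ν_0)dµ = 1 of the discrete eigenfunction can then be confirmed by quadrature. The sketch below is not the paper's numerical procedure: the function names, the bracketing strategy and the midpoint rule are assumptions chosen for illustration, and the lower bracket 1 + 10⁻¹² restricts it to values of c that are not too small.

```python
import math


def dispersion(nu0, c):
    """(c*nu0/2) * ln((nu0+1)/(nu0-1)) - 1; a root nu0 > 1 is the discrete eigenvalue."""
    return 0.5 * c * nu0 * math.log((nu0 + 1.0) / (nu0 - 1.0)) - 1.0


def solve_nu0(c, iterations=200):
    """Bisection for nu0 > 1, assuming 0 < c < 1 with c not too close to 0."""
    lo = 1.0 + 1e-12
    if dispersion(lo, c) <= 0.0:
        raise ValueError("root is too close to 1 to bracket in double precision")
    hi = 2.0
    while dispersion(hi, c) > 0.0:      # the left side decreases towards c - 1 < 0
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if dispersion(mid, c) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def phi0(mu, nu0, c):
    """Discrete eigenfunction phi(mu, nu0) = (c*nu0/2) / (nu0 - mu)."""
    return 0.5 * c * nu0 / (nu0 - mu)


def normalization(nu0, c, n=20000):
    """Midpoint-rule estimate of the integral of phi(mu, nu0) over [-1, 1]."""
    h = 2.0 / n
    return h * sum(phi0(-1.0 + (k + 0.5) * h, nu0, c) for k in range(n))


c = 0.5                                  # illustrative value of the constant c
nu0 = solve_nu0(c)
print(nu0, dispersion(nu0, c), normalization(nu0, c))
```

The normalization check works because the integral of (cν_0/2)/(ν_0 − µ) over [−1, 1] is exactly the left side of the constraint, so both tests express the same condition in two ways.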