What is cross-entropy?
       From Riemann to Monte-Carlo
           Cross-Entropy techniques
                Cross-Entropy tricks
                           Questions




Using cross-entropy techniques for rare event
        simulation and optimization

                         Arthur Breitman

                   NYC Machine learning meetup


                         August 18, 2011




                    Arthur Breitman    crossentropy for rare event simulation and optimization

Outline
   What is cross-entropy?
      Entropy
      Kullback-Leibler divergence
   From Riemann to Monte-Carlo
      Riemann integration
      Monte-Carlo integration
      Importance sampling
   Cross-Entropy techniques
      Analytical expressions
      Simulation of rare events
      Optimization
      Fitting parameters
   Cross-Entropy tricks
      Multiple maxima
      Slow convergence
   Questions

Information entropy


   Definition of information entropy
     ◮   Entropy measures the disorder of a physical system
     ◮   Entropy measures information (Shannon)
     ◮   Entropy measures ignorance (E.T. Jaynes)
     ◮   Formally:
         \[ H = -\sum_{x \in \Omega} p(x) \ln p(x) \]
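The discrete definition is easy to check numerically. The following is a minimal sketch in plain Python; the function name `entropy` and the example coins are illustrative assumptions, not part of the talk.

```python
import math

def entropy(p):
    """Shannon entropy, in nats, of a discrete distribution given as a list of probabilities."""
    # Terms with p(x) = 0 contribute nothing (the limit of p ln p is 0).
    return -sum(pi * math.log(pi) for pi in p if pi > 0.0)

fair_coin = entropy([0.5, 0.5])     # maximal ignorance over two outcomes: ln 2
biased_coin = entropy([0.9, 0.1])   # less ignorance, so lower entropy
certain = entropy([1.0, 0.0])       # no ignorance at all: 0
```

Dividing the result by `math.log(2)` converts nats to bits.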





The continuous case



   In the continuous case, for a random variable X with p.d.f. p(x),
   entropy is defined as

   \[ H(X) = -\int_{\Omega} p(x) \ln p(x) \, dx \]

   Simple, right?





The entropy of a probability distribution is meaningless



   Wrong! The naive continuous definition is:
     ◮   Not invariant under a change of variable
     ◮   Can even be negative!
     ◮   Not a true extension of Shannon’s discrete entropy.
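A short worked example makes the first two objections concrete. It is a sketch using uniform distributions chosen for illustration; the variables X and Y are assumptions, not from the talk.

```latex
% X uniform on (0, 1/2): density p(x) = 2 on that interval
H(X) = -\int_0^{1/2} 2 \ln 2 \, dx = -\ln 2 < 0

% Y = 2X is uniform on (0, 1): density p(y) = 1
H(Y) = -\int_0^{1} 1 \cdot \ln 1 \, dy = 0
```

A mere change of units (Y = 2X) shifts the naive entropy by ln 2, and the value for X is negative.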





E.T. Jaynes to the rescue


   E.T. Jaynes adjusted the definition. Consider a sequence of
   discrete point sets in Ω that becomes dense in Ω; their density must
   approach a limiting density m. Set

   \[ H(X) = -\int_{\Omega} p(x) \ln \frac{p(x)}{m(x)} \, dx \]

   N.B. m is not necessarily a probability distribution, just a density,
   so improper priors are O.K.





Definition of KL divergence

   Kullback-Leibler divergence: entropy of a probability distribution P
   relative to a probability distribution Q (densities p and q), with the
   sign flipped so that it is non-negative

   \[ D_{KL}(P \| Q) = \int_{\Omega} p(x) \ln \frac{p(x)}{q(x)} \, dx \]

     ◮   Similar to, but distinct from, entropy.
     ◮   Expected number of extra nats (or bits) needed to encode data
         drawn from P with a code optimized for Q.
     ◮   Not symmetric!
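The asymmetry shows up even for two coins. This is a minimal sketch for discrete distributions using the standard non-negative convention D(P‖Q) = Σ p ln(p/q); the helper name `kl_divergence` and the example distributions are illustrative assumptions.

```python
import math

def kl_divergence(p, q):
    """D_KL(P || Q) in nats for discrete distributions given as lists.

    Assumes q[i] > 0 wherever p[i] > 0; otherwise the divergence is infinite.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

p = [0.5, 0.5]   # fair coin
q = [0.9, 0.1]   # biased coin

d_pq = kl_divergence(p, q)   # cost of coding fair-coin data with the biased-coin code
d_qp = kl_divergence(q, p)   # cost of coding biased-coin data with the fair-coin code
```

Both values are positive, but they differ: swapping the arguments changes the answer.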




Why code length matters



    ◮   All ML problems ⇔ fitting a probability distribution
    ◮   KL divergence measures how concise your description is
    ◮   Relates to MDL and Solomonoff induction
    ◮   PAC-learning patches against a lack of epistemology





Likelihood of parameters and Cross-Entropy



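   Given a sample {q_i}, i = 1..N, of Q and a family {P_θ}, θ ∈ Θ, with
   densities p_θ, formally

   \[ \frac{1}{N} LL(\theta \mid \{q_i\}) = \frac{1}{N} \sum_{i=1}^{N} \ln p_\theta(q_i)
      = -H(\hat{Q}) - D_{KL}(\hat{Q} \| P_\theta), \qquad
      \hat{Q} = \frac{1}{N} \sum_{i=1}^{N} \delta_{q_i} \]

   where D_KL(A‖B) = ∫ a ln(a/b) ≥ 0. H(Q̂) does not depend on θ, so
   maximizing the likelihood of θ is minimizing the KL divergence from the
   Dirac comb Q̂ to P_θ, i.e. the cross-entropy between them.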
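The link between likelihood and cross-entropy can be checked numerically in the discrete case, where the Dirac comb is simply the table of empirical frequencies. The sketch below assumes a three-outcome categorical distribution; all names and numbers are illustrative, and the KL divergence uses the standard non-negative convention Σ q ln(q/p).

```python
import math
import random
from collections import Counter

random.seed(0)
outcomes = ["a", "b", "c"]
sample = random.choices(outcomes, weights=[0.6, 0.3, 0.1], k=1000)

n = len(sample)
empirical = {x: c / n for x, c in Counter(sample).items()}  # the "Dirac comb"

def avg_neg_log_likelihood(model):
    """-(1/N) * log-likelihood of the sample under the model."""
    return -sum(math.log(model[x]) for x in sample) / n

def entropy(q):
    return -sum(qx * math.log(qx) for qx in q.values() if qx > 0.0)

def kl(q, model):
    return sum(qx * math.log(qx / model[x]) for x, qx in q.items() if qx > 0.0)

model = {"a": 0.5, "b": 0.25, "c": 0.25}          # some candidate P_theta
lhs = avg_neg_log_likelihood(model)               # cross-entropy of sample vs model
rhs = entropy(empirical) + kl(empirical, model)   # H(Q_hat) + D_KL(Q_hat || P_theta)
best = avg_neg_log_likelihood(empirical)          # the maximum-likelihood fit
```

`lhs` and `rhs` agree up to rounding, and no model does better than the empirical frequencies themselves.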





Riemann integration



   How does one compute the integral of a function? Rectangle
   method:

   \[ \int_a^b f(x) \, dx \approx \frac{b - a}{N} \sum_{i=0}^{N-1} f\left(a + (b - a) \frac{i}{N}\right) \]

   Linear convergence: the error is O(1/N).
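The rectangle method and its linear convergence can be sketched in a few lines. The test integrand (exp on [0, 1]) and the function name `rectangle_rule` are illustrative choices, not from the talk.

```python
import math

def rectangle_rule(f, a, b, n):
    """Left-rectangle approximation of the integral of f over [a, b] with n cells."""
    h = (b - a) / n
    return h * sum(f(a + i * h) for i in range(n))

exact = math.e - 1.0  # integral of exp over [0, 1]
err_100 = abs(rectangle_rule(math.exp, 0.0, 1.0, 100) - exact)
err_200 = abs(rectangle_rule(math.exp, 0.0, 1.0, 200) - exact)
ratio = err_100 / err_200  # doubling n roughly halves the error
```

Note the factor (b - a) / n in front of the sum: the cell width, not just 1 / n.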





The curse of dimensionality



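   Multiple dimensions?

   \[ \int_{a_1}^{b_1} \cdots \int_{a_m}^{b_m} f(x) \, dx \approx
      \frac{\prod_{j=1}^{m} (b_j - a_j)}{N^m}
      \sum_{i_1=0}^{N-1} \cdots \sum_{i_m=0}^{N-1}
      f\left(a + \frac{1}{N} \, i \circ (b - a)\right) \]

   where i = (i_1, ..., i_m) and ∘ is the element-wise product.
   Computation is exponential in m: N^m evaluations of f.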





Monte-Carlo integration



   If P is a probability distribution over Ω with density p, draw {x_i} from P:

   \[ \int_{\Omega} f(x) \, dx \approx \frac{1}{N} \sum_{i=1}^{N} \frac{f(x_i)}{p(x_i)} \]

   Very simple to implement; often p is simply uniform (p ∝ 1).
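A minimal sketch of the estimator with a uniform sampling density, using only the standard library. The integrand, the interval and the name `mc_integrate` are assumptions made for illustration.

```python
import math
import random

def mc_integrate(f, a, b, n, rng):
    """Monte-Carlo estimate of the integral of f over [a, b].

    Points are drawn uniformly, so the sampling density is p(x) = 1 / (b - a).
    """
    p = 1.0 / (b - a)
    return sum(f(rng.uniform(a, b)) / p for _ in range(n)) / n

rng = random.Random(42)
estimate = mc_integrate(math.exp, 0.0, 1.0, 100_000, rng)
exact = math.e - 1.0
```

With 100,000 draws the standard error here is on the order of 0.002, so `estimate` should match `exact` to about two decimal places.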





Monte-Carlo convergence



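    ◮   Let the random variable X_p = f(x)/p(x), with x drawn from P
    ◮   If var(X_p) < ∞, the error is O(N^{-1/2}) by the central-limit
        theorem, whatever the dimension m!
    ◮   A rectangle grid with N points in m dimensions has error
        O(N^{-1/m}), so if m > 2, Monte-Carlo becomes attractive.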





Problems with MC




    ◮   If the mass of f is concentrated in a small region, convergence
        can be very slow.
    ◮   Also a problem with Riemann integration...





Importance sampling


    ◮   Preferentially sample the regions of interest by picking p to
        minimize the variance of f/p
    ◮   In Riemann world, equivalent to an irregular grid
    ◮   Ideal sampling distribution (if f > 0) is p = f / ∫ f, but we don’t
        know ∫ f!
    ◮   Best convergence when the χ² divergence of f w.r.t. p is minimized
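The points above can be illustrated on a small tail probability. The sketch below estimates P(Z > 4) for a standard normal Z by sampling from a normal shifted onto the region of interest. The choice of proposal N(4, 1), the threshold and the function names are illustrative assumptions.

```python
import math
import random

def normal_pdf(x, mu=0.0):
    """Density of a unit-variance normal with mean mu."""
    return math.exp(-0.5 * (x - mu) ** 2) / math.sqrt(2.0 * math.pi)

def tail_probability_is(threshold, n, rng):
    """Importance-sampling estimate of P(Z > threshold), Z standard normal."""
    total = 0.0
    for _ in range(n):
        x = rng.gauss(threshold, 1.0)          # draw from the proposal p = N(threshold, 1)
        if x > threshold:                      # f(x) = indicator * standard normal density
            total += normal_pdf(x) / normal_pdf(x, threshold)   # weight f / p
    return total / n

rng = random.Random(1)
estimate = tail_probability_is(4.0, 100_000, rng)
exact = 0.5 * math.erfc(4.0 / math.sqrt(2.0))   # about 3.2e-5
```

Plain Monte-Carlo with the same budget would see only about three hits above the threshold; here about half the draws land in the tail and the weights correct for the shift.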





Adaptive importance sampling




    ◮   What if we don’t know the shape of f ?
    ◮   Learn it adaptively from the sampling.
    ◮   Iteratively improve the importance sampling function.





Vegas algorithm and cross-entropy




    ◮   VEGAS algorithm: use histograms and treat the variables as separable
    ◮   Cross-entropy algorithm: pick p from a parametric family of
        distributions to minimize the cross-entropy to the sample





Why cross-entropy?



   In many cases, the distribution minimizing the cross-entropy has an
   analytical expression that is computationally cheap to evaluate, e.g.
     ◮   the uniform distribution
     ◮   the categorical distribution (finite, discrete)
     ◮   every natural exponential family





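The natural exponential family?



   \[ f_X(x \mid \theta) = h(x) \exp\left(\theta^{\top} x - A(\theta)\right) \]


     ◮   θ is the natural parameter; x itself is the sufficient statistic
     ◮   Maximum-entropy distribution w.r.t. the base measure dH = h(x) dx,
         given the mean of x (which determines θ)
     ◮   Examples (some with a parameter held fixed): normal,
         multivariate normal, gamma, binomial, multinomial, negative binomial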





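Beta distribution
   No closed-form fit! To fit, start with approximate values from the
   method of moments, with sample mean X̄ and sample variance S²:

   \[ \alpha = \bar{X} \left( \frac{\bar{X}(1 - \bar{X})}{S^2} - 1 \right), \qquad
      \beta = (1 - \bar{X}) \left( \frac{\bar{X}(1 - \bar{X})}{S^2} - 1 \right) \]

   The log-likelihood is given by

   \[ n \left( \ln\Gamma(\alpha + \beta) - \ln\Gamma(\alpha) - \ln\Gamma(\beta) \right)
      + (\alpha - 1) \sum_{i=1}^{n} \ln X_i + (\beta - 1) \sum_{i=1}^{n} \ln(1 - X_i) \]

   Its first and second derivatives involve the digamma and trigamma
   functions, available in the GSL. Newton’s method using the Hessian
   converges in a couple of iterations. Very useful to model bounded
   variables.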
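The fitting recipe can be sketched in standard-library Python. Since the standard library has no digamma or trigamma, the sketch approximates them with the usual recurrence plus asymptotic series; those two helpers, the function name `fit_beta` and the synthetic Beta(2, 5) data are assumptions for illustration, not the talk's GSL-based code.

```python
import math
import random

def digamma(x):
    """Digamma for x > 0: shift x above 6 by recurrence, then asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    f = 1.0 / (x * x)
    return result + math.log(x) - 0.5 / x - f * (1.0 / 12 - f * (1.0 / 120 - f / 252))

def trigamma(x):
    """Trigamma for x > 0, same strategy as digamma."""
    result = 0.0
    while x < 6.0:
        result += 1.0 / (x * x)
        x += 1.0
    f = 1.0 / (x * x)
    return result + 1.0 / x + f / 2 + (f / x) * (1.0 / 6 - f * (1.0 / 30 - f / 42))

def fit_beta(xs, iterations=20):
    """Maximum-likelihood Beta(a, b) fit: moment start, then Newton steps."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / (n - 1)
    common = mean * (1.0 - mean) / var - 1.0
    a, b = mean * common, (1.0 - mean) * common      # method-of-moments start
    s1 = sum(math.log(x) for x in xs)
    s2 = sum(math.log(1.0 - x) for x in xs)
    for _ in range(iterations):
        # Gradient of the log-likelihood
        ga = n * (digamma(a + b) - digamma(a)) + s1
        gb = n * (digamma(a + b) - digamma(b)) + s2
        # Hessian entries: [[haa, t], [t, hbb]]
        t = n * trigamma(a + b)
        haa = t - n * trigamma(a)
        hbb = t - n * trigamma(b)
        det = haa * hbb - t * t
        # Newton step: (a, b) -= H^{-1} g, with the 2x2 inverse written out
        a -= (hbb * ga - t * gb) / det
        b -= (haa * gb - t * ga) / det
    return a, b

rng = random.Random(0)
data = [rng.betavariate(2.0, 5.0) for _ in range(5000)]
alpha_hat, beta_hat = fit_beta(data)
```

The fixed iteration count is generous; starting from the moment estimates, the step size typically becomes negligible after a handful of iterations.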

Surviving the zombie hordes




              Figure: Electric fences, the horde and you


Simulating zombie breakouts



    ◮   Each fence i, with parameters (U_i, λ_i), delivers
        u ∼ max(U_i − Exp(λ_i), 0) volts.
    ◮   Crossing a fence deals u damage to a zombie.
    ◮   Zombies come from everywhere and can take 5 points of damage
        each.
    ◮   Zombie outbreaks are very rare!





Mere integration fails!




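     ◮   We can estimate this probability by sampling the random
         voltages and finding a shortest (least-damage) path.
     ◮   Each Monte-Carlo draw has variance p_outbreak (1 − p_outbreak), so
         the relative error is sqrt((1 − p_outbreak) / (N p_outbreak)):
         10% accuracy takes about 100 / p_outbreak samples. Too slow!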





Cross-Entropy to the rescue



    ◮   We want to approximate the multivariate power distribution
        conditional on an outbreak occurring!
    ◮   Approximate the shape by changing the parameters Ui and λi
        for each fence
    ◮   Generate samples, fit Ui and λi on the samples inducing an
        outbreak





The elite sample

   What if the probability is so low that we don’t observe any
   outbreak in our sample?
     ◮   Generate n samples using the current sampling distribution
     ◮   If more than e samples are outbreaks, fit to those samples and
         stop iterating
     ◮   Otherwise, fit on the e best samples (the elite sample) and
         iterate
     ◮   Finally, generate a sample, weight each point by its importance
         sampling weight, and estimate the probability
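The elite-sample loop can be sketched on a simpler toy problem than the fences, because the talk does not give the fence layout. The sketch below assumes the rare event is a sum of five independent exponential variables with mean 1 exceeding 20 (probability about 1.7e-5), and that the sampling family is the same exponentials with adjustable means v. For this family the cross-entropy fit is the importance-weighted mean of the elite sample. All names and parameter values are illustrative.

```python
import math
import random

def ce_rare_event(u, gamma, n=2000, rho=0.1, n_final=20000, seed=0, max_iter=50):
    """Estimate P(sum_j X_j >= gamma) for independent X_j ~ Exponential(mean u[j])."""
    rng = random.Random(seed)
    m = len(u)
    v = list(u)   # means of the sampling distribution, starting at the nominal ones

    def draw():
        return [rng.expovariate(1.0 / vj) for vj in v]

    def weight(x):
        # Importance weight: nominal density f(x; u) over sampling density f(x; v)
        return math.exp(sum(math.log(vj / uj) - xj * (1.0 / uj - 1.0 / vj)
                            for xj, uj, vj in zip(x, u, v)))

    for _ in range(max_iter):
        xs = [draw() for _ in range(n)]
        scores = sorted(sum(x) for x in xs)
        # Elite threshold: the (1 - rho) quantile, capped at the target level
        level = min(gamma, scores[int((1.0 - rho) * n)])
        elite = [(weight(x), x) for x in xs if sum(x) >= level]
        total_w = sum(w for w, _ in elite)
        # Cross-entropy fit: weighted mean of each coordinate over the elite sample
        v = [sum(w * x[j] for w, x in elite) / total_w for j in range(m)]
        if level >= gamma:   # enough samples already hit the rare event
            break

    # Final importance-sampling estimate with the fitted sampling distribution
    xs = [draw() for _ in range(n_final)]
    return sum(weight(x) for x in xs if sum(x) >= gamma) / n_final, v

estimate, fitted_means = ce_rare_event([1.0] * 5, 20.0)
exact = math.exp(-20.0) * sum(20.0 ** k / math.factorial(k) for k in range(5))
```

The closed form in `exact` is the tail of a Gamma(5, 1) distribution, which is only available because the toy problem was chosen to have one; in the fence problem the final weighted estimate is all one gets.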




Other examples




    ◮   Modeling rare events for any complex probability distribution,
        e.g. Bayesian networks.
    ◮   Estimating tails for the sum of fat-tailed distributions





From integration to optimization

   Using an elite sample to help convergence is a trick that performs a
   form of hill climbing on a smooth function approximating the
   indicator function of the rare event.
     ◮   Interesting even if we do not care about integrating f.
     ◮   Keep iterating based on an elite sample to converge towards
         a global maximum.
     ◮   The variance of the sampling distribution follows the curvature of
         f.
     ◮   E.g. using a multivariate normal allows the covariance to
         reflect the differential of f.


Combinatorial optimization



   One classical example is combinatorial optimization. To solve a
   travelling salesman problem (TSP) with Cross-Entropy:
     ◮   Assume the tour is a Markov chain on the graph nodes.
     ◮   Generate tours by coercing them to be permutations.
     ◮   Update the transition probabilities from the elite sample.





Clustering


   CE does clustering too!
     ◮   Assign probabilities of membership to classes for each point
         (the sampling distribution).
     ◮   Sample random membership assignments.
     ◮   Use average distance to centroids to find an elite sample.
     ◮   Slower than K-means but much less sensitive to initial choice
         of centroids.





A form of global optimization



   Is it global optimization?
     ◮   If the sampling distribution is bounded below by a distribution
         that covers the global maximum, yes, with probability 1!
     ◮   In practice we may never see one maximum and converge to
         another local maximum.





Fitting model parameters with CE


   Cross-Entropy techniques generally work very well for finding the
   maximum-likelihood (ML) parameters of a model. Why?
     ◮   Models often have different sensitivities to different
         parameters, and the CE sampling distribution reflects that.
     ◮   With a covariance structure (e.g. a multivariate normal
         sampling distribution), it does a form of gradient ascent.
     ◮   But it can deal with discrete parameters at the same time!
     ◮   It does not tend to get trapped in local maxima.
     ◮   Well suited for high-dimensional parameter spaces.
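
   As a minimal sketch of the idea, the code below fits the two
   parameters of a normal model by CE. The model, the function names, the
   initial spreads, and the use of independent (diagonal) normals as the
   sampling distribution are assumptions for illustration; a full
   covariance matrix would be needed to capture the covariance structure
   mentioned above.

```python
import math
import random
import statistics

def log_likelihood(theta, data):
    # Model assumed for illustration: data ~ Normal(m, s), theta = (m, log s).
    # Additive constants that do not depend on theta are dropped.
    m, log_s = theta
    s = math.exp(log_s)
    return sum(-log_s - 0.5 * ((x - m) / s) ** 2 for x in data)

def ce_fit(data, n_iter=40, n_samples=200, n_elite=20, seed=0):
    """Maximize the log-likelihood over theta by cross-entropy search."""
    rng = random.Random(seed)
    mu = [0.0, 0.0]    # means of the sampling distribution over theta
    sd = [5.0, 2.0]    # standard deviations, broad on purpose
    for _ in range(n_iter):
        thetas = [tuple(rng.gauss(mu[j], sd[j]) for j in range(2))
                  for _ in range(n_samples)]
        thetas.sort(key=lambda t: log_likelihood(t, data), reverse=True)
        elite = thetas[:n_elite]
        # Refit each coordinate separately: a parameter the likelihood is
        # sensitive to ends up with a narrower sampling distribution.
        for j in range(2):
            col = [t[j] for t in elite]
            mu[j] = statistics.fmean(col)
            sd[j] = statistics.pstdev(col)
    return mu[0], math.exp(mu[1])
```

   For this model the result can be checked against the closed-form ML
   estimates, `statistics.fmean(data)` and `statistics.pstdev(data)`.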







Forgetting maxima



   Some maxima can be "forgotten" as the sampling distribution narrows.
   Remedies:
    ◮   Make smooth changes to the sampling function.
    ◮   Expand the sampling function (equivalent to applying a prior
        or "shrinkage").
    ◮   Keep the entire sample.
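
   The first two remedies can be sketched as a single update rule for a
   normal sampling distribution. The function name `smoothed_update` and
   the parameters `alpha` (smoothing) and `inflate` (expansion factor)
   are hypothetical, chosen for this example.

```python
import statistics

def smoothed_update(mu, sigma, elite, alpha=0.5, inflate=1.2):
    """Blend the elite fit into the old normal distribution, then widen it."""
    elite_mu = statistics.fmean(elite)
    elite_sigma = statistics.pstdev(elite)
    # Smooth change: move only a fraction alpha of the way to the elite fit.
    new_mu = alpha * elite_mu + (1 - alpha) * mu
    new_sigma = alpha * elite_sigma + (1 - alpha) * sigma
    # Expansion: widen the result so distant maxima keep some sampling mass.
    return new_mu, inflate * new_sigma
```

   Setting `alpha=1.0` and `inflate=1.0` recovers the plain update that
   refits directly to the elite sample.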







Not converging to a maximum



  Multiple maxima may prevent the variance of the sampling
  distribution from decreasing.
    ◮   Mixtures of multivariate normals can deal with this.
    ◮   They can be introduced dynamically.
    ◮   Fit with EM.
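
  As an illustration of the EM step, the sketch below fits a
  two-component one-dimensional normal mixture to an elite sample. It
  assumes a fixed number of components, a crude min/max initialisation,
  and a floor `min_sigma` on the spread; introducing components
  dynamically is not shown.

```python
import math
import statistics

def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def fit_mixture_em(xs, n_iter=50, min_sigma=1e-3):
    """Fit a two-component 1-D normal mixture to xs with EM."""
    mus = [min(xs), max(xs)]                         # crude initialisation
    sigmas = [max(statistics.pstdev(xs), min_sigma)] * 2
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            p = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, mus, sigmas)]
            total = sum(p) or 1.0                    # guard against underflow to 0
            resp.append([pk / total for pk in p])
        # M-step: weighted re-estimation of each component.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk == 0:
                continue                             # leave an empty component as is
            weights[k] = nk / len(xs)
            mus[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - mus[k]) ** 2 for r, x in zip(resp, xs)) / nk
            sigmas[k] = max(math.sqrt(var), min_sigma)
    return weights, mus, sigmas
```

  Sampling from the fitted mixture then means picking a component
  according to `weights` and drawing from the corresponding normal, so
  each maximum can keep its own narrow component.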







Independent variables




   If the sampling distribution is separable, convergence can be sped
   up by sampling over one dimension at a time.
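
   One possible reading of this trick is a coordinate-wise sweep: vary a
   single coordinate while the others stay at their current means, and
   refit only that coordinate's distribution. The sketch below follows
   that reading; the function name, the initial spread, and the sample
   sizes are assumptions.

```python
import random
import statistics

def ce_coordinatewise(f, dim, n_sweeps=10, n_samples=200, n_elite=20, seed=0):
    """Maximize f with a separable normal sampling distribution,
    updating one dimension at a time."""
    rng = random.Random(seed)
    mu = [0.0] * dim
    sd = [5.0] * dim
    for _sweep in range(n_sweeps):
        for j in range(dim):
            # Vary only coordinate j; the others stay at their current means.
            cands = []
            for _ in range(n_samples):
                x = list(mu)
                x[j] = rng.gauss(mu[j], sd[j])
                cands.append(x)
            cands.sort(key=f, reverse=True)
            col = [x[j] for x in cands[:n_elite]]
            # Refit only the j-th marginal to the elite sample.
            mu[j] = statistics.fmean(col)
            sd[j] = statistics.pstdev(col)
    return mu
```

   Each one-dimensional search needs far fewer samples to find a useful
   elite than a joint search over all dimensions, which is where the
   speed-up would come from when the variables really are independent.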







Questions




   Questions?



