Local Search and Optimization
(Based on slides of Padhraic Smyth,
Stuart Russell, Rao Kambhampati,
Raj Rao, Dan Weld…)
Outline
• Local search techniques and optimization
– Hill-climbing
– Gradient methods
– Simulated annealing
– Genetic algorithms
– Issues with local search
Local search and optimization
• Previous lecture: path to goal is solution to problem
– systematic exploration of search space.
• This lecture: a state is solution to problem
– for some problems path is irrelevant.
– E.g., 8-queens
• Different algorithms can be used
– Depth First Branch and Bound
– Local search
Goal
• Satisfaction
– reach the goal node
– Constraint satisfaction
• Optimization
– optimize(objective fn)
– Constraint optimization
• You can go back and forth between the two problems
– Typically in the same complexity class
© Mausam
Local search and optimization
• Local search
– Keep track of single current state
– Move only to neighboring states
– Ignore paths
• Advantages:
– Use very little memory
– Can often find reasonable solutions in large or infinite (continuous)
state spaces.
• “Pure optimization” problems
– All states have an objective function
– Goal is to find state with max (or min) objective value
– Does not quite fit into path-cost/goal-state formulation
– Local search can do quite well on these problems.
Trivial Algorithms
• Random Sampling
– Generate a state randomly
• Random Walk
– Randomly pick a neighbor of the current state
• Both algorithms asymptotically complete.
Hill-climbing (Greedy Local Search)
max version
function HILL-CLIMBING(problem) returns a state that is a local maximum
  input: problem, a problem
  local variables: current, a node
                   neighbor, a node
  current ← MAKE-NODE(INITIAL-STATE[problem])
  loop do
    neighbor ← a highest-valued successor of current
    if VALUE[neighbor] ≤ VALUE[current] then return STATE[current]
    current ← neighbor
The min version reverses the inequality and looks for the lowest-valued successor.
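The pseudocode above can be sketched in Python as follows. This is a minimal illustration, not the slides' own code: the function name, the `successors` and `value` callbacks, and the toy objective are assumptions made for the example.

```python
from typing import Callable, Iterable, TypeVar

S = TypeVar("S")


def hill_climbing(
    initial: S,
    successors: Callable[[S], Iterable[S]],
    value: Callable[[S], float],
) -> S:
    """Steepest-ascent hill climbing (max version); returns a local maximum."""
    current = initial
    while True:
        # Pick a highest-valued successor; None if there are no successors.
        best = max(successors(current), key=value, default=None)
        # Stop as soon as no successor is strictly better than the current state.
        if best is None or value(best) <= value(current):
            return current
        current = best


# Hypothetical toy problem: maximize -(x - 3)^2 over the integers, moving by +/-1.
peak = hill_climbing(0, lambda x: [x - 1, x + 1], lambda x: -(x - 3) ** 2)
print(peak)  # 3
```

The toy objective has a single peak, so the search reaches it from any start; on a landscape with several peaks the same code returns whichever local maximum lies uphill from the initial state.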
Hill-climbing search
• “a loop that continuously moves towards increasing value”
– terminates when a peak is reached
– Aka greedy local search
• Value can be either
– Objective function value
– Heuristic function value (minimized)
• Hill climbing does not look ahead of the immediate neighbors
• Can randomly choose among the set of best successors
– if multiple have the best value
• “climbing Mount Everest in a thick fog with amnesia”
“Landscape” of search
Hill Climbing gets stuck in local minima
depending on?
Example: n-queens
• Put n queens on an n x n board with no two
queens on the same row, column, or diagonal
• Is it a satisfaction problem or optimization?
Hill-climbing search: 8-queens problem
• Need to convert to an optimization problem
• h = number of pairs of queens that are attacking each other
• h = 17 for the above state
Search Space
• State
– All 8 queens on the board in some configuration
• Successor function
– move a single queen to another square in the same
column.
• Example of a heuristic function h(n):
– the number of pairs of queens that are attacking each
other
– (so we want to minimize this)
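One way to encode this search space in Python is sketched below. The representation is an assumption for illustration: a state is a tuple where `state[c]` is the 0-indexed row of the queen in column `c`, so two queens can never share a column. The names `attacking_pairs` and `successors` are hypothetical.

```python
from itertools import combinations
from typing import Iterator, Tuple

State = Tuple[int, ...]  # state[c] = row of the queen in column c (0-indexed)


def attacking_pairs(state: State) -> int:
    """h: number of pairs of queens that attack each other (same row or diagonal)."""
    return sum(
        1
        for (c1, r1), (c2, r2) in combinations(enumerate(state), 2)
        if r1 == r2 or abs(r1 - r2) == abs(c1 - c2)
    )


def successors(state: State) -> Iterator[State]:
    """All states reached by moving a single queen to another square in its column."""
    n = len(state)
    for col in range(n):
        for row in range(n):
            if row != state[col]:
                yield state[:col] + (row,) + state[col + 1:]


print(attacking_pairs((0, 4, 7, 5, 2, 6, 1, 3)))  # 0, a solution
print(attacking_pairs((0,) * 8))                  # 28, every pair attacks
print(len(list(successors((0,) * 8))))            # 56 = 8 columns x 7 other rows
```

With this encoding every state has 8 × 7 = 56 successors, and a state is a solution exactly when h = 0.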
Hill-climbing search: 8-queens problem
• Is this a solution?
• What is h?
Hill-climbing on 8-queens
• Randomly generated 8-queens starting states…
• 14% of the time it solves the problem
• 86% of the time it gets stuck at a local minimum
• However…
– Takes only 4 steps on average when it succeeds
– And 3 on average when it gets stuck
– (for a state space with 8^8 ≈ 17 million states)
Hill Climbing Drawbacks
• Local maxima
• Plateaus
• Diagonal ridges
Escaping Shoulders: Sideways Move
• If no downhill (uphill) moves, allow sideways moves
in hope that algorithm can escape
– Need to place a limit on the possible number of sideways
moves to avoid infinite loops
• For 8-queens
– Now allow sideways moves with a limit of 100
– Raises percentage of problem instances solved from 14% to 94%
– However…
• about 21 steps on average for each successful run
• about 64 for each failure
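The sideways-move rule can be sketched as a small change to hill climbing (min version). This is an illustrative sketch under assumptions: the function name, the callbacks, the choice to count consecutive sideways moves, and the three-state toy landscape are not from the slides.

```python
import random
from typing import Callable, Optional, Sequence, TypeVar

S = TypeVar("S")


def hill_climbing_sideways(
    initial: S,
    successors: Callable[[S], Sequence[S]],
    h: Callable[[S], float],
    max_sideways: int = 100,
    rng: Optional[random.Random] = None,
) -> S:
    """Minimize h, allowing up to max_sideways consecutive equal-valued moves."""
    rng = rng or random.Random(0)
    current, sideways = initial, 0
    while True:
        neighbors = list(successors(current))
        if not neighbors:
            return current
        best_h = min(h(s) for s in neighbors)
        if best_h > h(current):
            return current              # strict local minimum
        if best_h == h(current):
            sideways += 1               # sideways move along a plateau or shoulder
            if sideways > max_sideways:
                return current          # budget exhausted, give up
        else:
            sideways = 0                # real progress resets the budget
        # Choose randomly among the best successors.
        current = rng.choice([s for s in neighbors if h(s) == best_h])


# Hypothetical toy landscape: a shoulder (1, 1) next to the minimum (0).
values = [1, 1, 0]
neighbors = lambda x: [y for y in (x - 1, x + 1) if 0 <= y < len(values)]
print(hill_climbing_sideways(0, neighbors, lambda x: values[x]))                  # 2
print(hill_climbing_sideways(0, neighbors, lambda x: values[x], max_sideways=0))  # 0
```

With no sideways budget the search stops on the shoulder at state 0; with a budget it crosses to state 1 and then descends to the minimum at state 2.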
Tabu Search
• prevent returning quickly to the same state
• Keep fixed length queue (“tabu list”)
• add most recent state to queue; drop oldest
• Never make the step that is currently tabu’ed
• Properties:
– As the size of the tabu list grows, hill-climbing will asymptotically
become “non-redundant” (won’t look at the same state twice)
– In practice, a reasonable sized tabu list (say 100 or so) improves
the performance of hill climbing in many problems
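A minimal sketch of the tabu list idea follows. It makes assumptions the slide does not spell out: the search always moves to the best non-tabu neighbor even when that neighbor is worse, it remembers the best state seen, and it stops after `max_steps`. All names and the toy landscape are hypothetical.

```python
from collections import deque
from typing import Callable, Hashable, Iterable


def tabu_search(
    initial: Hashable,
    successors: Callable[[Hashable], Iterable[Hashable]],
    h: Callable[[Hashable], float],
    tabu_size: int = 100,
    max_steps: int = 1000,
):
    """Minimize h while refusing to revisit the tabu_size most recent states."""
    current = best = initial
    tabu = deque([initial], maxlen=tabu_size)  # fixed-length queue, oldest dropped
    for _ in range(max_steps):
        candidates = [s for s in successors(current) if s not in tabu]
        if not candidates:
            break                              # every neighbor is tabu
        current = min(candidates, key=h)       # best allowed move, even if uphill
        tabu.append(current)
        if h(current) < h(best):
            best = current
    return best


# Hypothetical toy landscape with a local minimum at 1 and the global minimum at 6.
values = [2, 1, 2, 3, 2, 1, 0]
neighbors = lambda x: [y for y in (x - 1, x + 1) if 0 <= y < len(values)]
print(tabu_search(0, neighbors, lambda x: values[x]))  # 6
```

Plain hill climbing would stop at state 1 here; the tabu list forbids stepping back, which pushes the search over the bump at state 3.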
Escaping Shoulders/local Optima
Enforced Hill Climbing
• Perform breadth first search from a local optimum
– to find the next state with better h function
• Typically,
– prolonged periods of exhaustive search
– bridged by relatively quick periods of hill-climbing
• Middle ground between local and systematic search
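Enforced hill climbing can be sketched as repeated breadth-first searches, each ending at the first state with a strictly better h. The sketch below assumes hashable states, a minimized h where h = 0 marks a goal, and hypothetical names throughout.

```python
from collections import deque
from typing import Callable, Hashable, Iterable


def enforced_hill_climbing(
    initial: Hashable,
    successors: Callable[[Hashable], Iterable[Hashable]],
    h: Callable[[Hashable], float],
):
    """Minimize h: breadth-first search from each state for a strictly better one."""
    current = initial
    while h(current) > 0:                      # assumption: h == 0 means goal
        frontier, seen = deque([current]), {current}
        better = None
        while frontier and better is None:
            state = frontier.popleft()
            for nxt in successors(state):
                if nxt in seen:
                    continue
                if h(nxt) < h(current):
                    better = nxt               # first improvement found by BFS
                    break
                seen.add(nxt)
                frontier.append(nxt)
        if better is None:
            return current                     # no better state is reachable
        current = better
    return current


# Hypothetical toy landscape with a local minimum at 1 and the goal at 6.
values = [2, 1, 2, 3, 2, 1, 0]
neighbors = lambda x: [y for y in (x - 1, x + 1) if 0 <= y < len(values)]
print(enforced_hill_climbing(0, neighbors, lambda x: values[x]))  # 6
```

From the local minimum at state 1, the breadth-first phase has to expand states 0 and 2 through 5 before it finds state 6, which illustrates the long exhaustive stretches mentioned above.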
Hill-climbing: stochastic variations
• Stochastic hill-climbing
– Random selection among the uphill moves.
– The selection probability can vary with the steepness of the uphill move.
• To avoid getting stuck in local minima
– Random-walk hill-climbing
– Random-restart hill-climbing
– Hill-climbing with both
Hill Climbing: stochastic variations
When the state-space landscape has local
minima, any search that moves only in the
greedy direction cannot be complete
Random walk, on the other hand, is
asymptotically complete
Idea: Put random walk into greedy hill-climbing
Hill-climbing with random restarts
• If at first you don’t succeed, try, try again!
• Different variations
– For each restart: run until termination vs. run for a fixed time
– Run a fixed number of restarts or run indefinitely
• Analysis
– Say each search has probability p of success
• E.g., for 8-queens, p = 0.14 with no sideways moves
– Expected number of restarts?
– Expected number of steps taken?
• If you want to pick one local search algorithm, learn this one!!
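The two analysis questions can be answered with a short calculation, sketched here under the assumption that runs are independent and each succeeds with probability p, so the number of runs until the first success is geometric with mean 1/p. The function name is hypothetical; the step counts 4 and 3 are the 8-queens averages quoted earlier in these slides.

```python
def random_restart_cost(p: float, steps_success: float, steps_failure: float):
    """Expected number of runs and of total steps until the first success."""
    expected_runs = 1 / p                    # mean of a geometric distribution
    expected_failures = expected_runs - 1    # equals (1 - p) / p
    expected_steps = steps_success + expected_failures * steps_failure
    return expected_runs, expected_steps


# 8-queens without sideways moves: p = 0.14, 4 steps per success, 3 per failure.
runs, steps = random_restart_cost(0.14, 4, 3)
print(round(runs, 1), round(steps, 1))  # 7.1 22.4
```

So about 7 runs (6 failures and 1 success) and roughly 22 steps in total are expected. Plugging in the sideways-move figures (p = 0.94, 21 and 64 steps) gives about 1.06 runs and about 25 steps.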
Hill-climbing with random walk
• At each step do one of the two
– Greedy: With prob p move to the neighbor with largest value
– Random: With prob 1-p move to a random neighbor
Hill-climbing with both
• At each step do one of the three
– Greedy: move to the neighbor with largest value
– Random Walk: move to a random neighbor
– Random Restart: Resample a new current state
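The two-way choice (greedy with probability p, random otherwise) might look like the sketch below. The step limit, the seed, and the best-so-far bookkeeping are assumptions added so that the function terminates and returns something useful; the names are hypothetical.

```python
import random
from typing import Callable, Sequence, TypeVar

S = TypeVar("S")


def random_walk_hill_climbing(
    initial: S,
    successors: Callable[[S], Sequence[S]],
    value: Callable[[S], float],
    p: float = 0.9,
    max_steps: int = 1000,
    seed: int = 0,
) -> S:
    """Maximize value: greedy move with probability p, random neighbor otherwise."""
    rng = random.Random(seed)
    current = best = initial
    for _ in range(max_steps):
        neighbors = list(successors(current))
        if not neighbors:
            break
        if rng.random() < p:
            current = max(neighbors, key=value)  # greedy step
        else:
            current = rng.choice(neighbors)      # random-walk step
        if value(current) > value(best):
            best = current                       # remember the best state seen
    return best


# Hypothetical toy problem: maximize -(x - 3)^2 over the integers.
result = random_walk_hill_climbing(
    0, lambda x: [x - 1, x + 1], lambda x: -(x - 3) ** 2
)
```

A third branch that resamples `current` from a random-state generator would give the "hill-climbing with both" variant.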
Simulated Annealing
• Simulated Annealing = physics inspired twist on random walk
• Basic ideas:
– like hill-climbing identify the quality of the local improvements
– instead of picking the best move, pick one randomly
– say the change in objective function is δ
– if δ is positive, then move to that state
– otherwise:
• move to this state with probability e^(δ/T), which shrinks as δ becomes more negative (T is the temperature)
• thus: worse moves (very large negative δ) are executed less often
– however, there is always a chance of escaping from local maxima
– over time, make it less likely to accept locally bad moves
– (Can also make the size of the move random as well, i.e., allow “large”
steps in state space)
Physical Interpretation of Simulated Annealing
• A Physical Analogy:
• imagine letting a ball roll downhill on the function surface
– this is like hill-climbing (for minimization)
• now imagine shaking the surface, while the ball rolls,
gradually reducing the amount of shaking
– this is like simulated annealing
• Annealing = physical process of cooling a liquid or metal
until particles achieve a certain frozen crystal state
• simulated annealing:
– free variables are like particles
– seek “low energy” (high quality) configuration
– slowly reducing temp. T with particles moving around randomly
Simulated annealing
function SIMULATED-ANNEALING(problem, schedule) returns a solution state
  input: problem, a problem
         schedule, a mapping from time to temperature
  local variables: current, a node
                   next, a node
                   T, a “temperature” controlling the prob. of downward steps
  current ← MAKE-NODE(INITIAL-STATE[problem])
  for t ← 1 to ∞ do
    T ← schedule[t]
    if T = 0 then return current
    next ← a randomly selected successor of current
    ∆E ← VALUE[next] - VALUE[current]
    if ∆E > 0 then current ← next
    else current ← next only with probability e^(∆E/T)
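A direct Python rendering of this pseudocode is sketched below. The callback-based interface, the seed, and the linear cooling schedule in the example are assumptions for illustration; the acceptance rule is the one in the pseudocode.

```python
import math
import random
from typing import Callable, Sequence, TypeVar

S = TypeVar("S")


def simulated_annealing(
    initial: S,
    successors: Callable[[S], Sequence[S]],
    value: Callable[[S], float],
    schedule: Callable[[int], float],
    seed: int = 0,
) -> S:
    """Maximize value; schedule(t) is the temperature at step t, 0 means stop."""
    rng = random.Random(seed)
    current = initial
    t = 1
    while True:
        T = schedule(t)
        if T <= 0:
            return current
        nxt = rng.choice(list(successors(current)))
        delta_e = value(nxt) - value(current)
        # Always accept improvements; accept other moves with probability e^(dE/T).
        if delta_e > 0 or rng.random() < math.exp(delta_e / T):
            current = nxt
        t += 1


# Hypothetical toy problem and linear cooling schedule (2000 steps, T from 1 to 0).
result = simulated_annealing(
    0,
    lambda x: [x - 1, x + 1],
    lambda x: -(x - 3) ** 2,
    lambda t: max(0.0, 1.0 - t / 2000),
)
```

Early on, with T near 1, the search wanders around the peak at x = 3; as T approaches 0 downhill moves are almost never accepted and the behaviour approaches that of a stochastic hill climber.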
Temperature T
• high T: probability of “locally bad” move is higher
• low T: probability of “locally bad” move is lower
• typically, T is decreased as the algorithm runs longer
• i.e., there is a “temperature schedule”
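The effect of T on the acceptance probability e^(∆E/T) can be checked numerically. In this sketch the geometric schedule and its parameters (`T0`, `alpha`, `T_min`) are hypothetical choices, not something the slides prescribe.

```python
import math


def acceptance_probability(delta_e: float, T: float) -> float:
    """Probability of accepting a move that changes the objective by delta_e."""
    return 1.0 if delta_e > 0 else math.exp(delta_e / T)


def geometric_schedule(t: int, T0: float = 10.0, alpha: float = 0.95,
                       T_min: float = 1e-3) -> float:
    """Hypothetical schedule: T0 * alpha^t, cut off to 0 once it drops below T_min."""
    T = T0 * alpha ** t
    return T if T > T_min else 0.0


# The same "locally bad" move (delta_e = -1) at three temperatures.
for T in (10.0, 1.0, 0.1):
    print(T, acceptance_probability(-1.0, T))
```

For ∆E = -1 the acceptance probability is about 0.90 at T = 10, about 0.37 at T = 1, and about 0.00005 at T = 0.1.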
Simulated Annealing in Practice
– method proposed in 1983 by IBM researchers for solving VLSI layout problems (Kirkpatrick et al., Science, 220:671-680, 1983).
• in theory, finds the global optimum with probability approaching 1, provided T is decreased slowly enough
– Other applications: Traveling salesman, Graph partitioning,
Graph coloring, Scheduling, Facility Layout, Image
Processing, …
– useful for some problems, but can be very slow
• slowness comes about because T must be decreased very
gradually to retain optimality
Local beam search
• Idea: Keeping only one node in memory is an
extreme reaction to memory problems.
• Keep track of k states instead of one
– Initially: k randomly selected states
– Next: determine all successors of k states
– If any of the successors is a goal → finished
– Else select k best from successors and repeat
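The four steps above can be sketched as follows, for a minimized h where h = 0 is treated as the goal test. The interface (`random_state`, `successors`, `h`), the iteration limit, and the toy problem are assumptions for illustration.

```python
import heapq
import random
from typing import Callable, Hashable, Iterable


def local_beam_search(
    k: int,
    random_state: Callable[[random.Random], Hashable],
    successors: Callable[[Hashable], Iterable[Hashable]],
    h: Callable[[Hashable], float],
    max_iters: int = 1000,
    seed: int = 0,
):
    """Minimize h while tracking k states; h == 0 is assumed to mark a goal."""
    rng = random.Random(seed)
    beam = [random_state(rng) for _ in range(k)]   # k randomly selected states
    for _ in range(max_iters):
        for s in beam:
            if h(s) == 0:
                return s                           # goal found
        # Pool the successors of all k states, dropping duplicates.
        pool = {s2 for s in beam for s2 in successors(s)}
        if not pool:
            break
        beam = heapq.nsmallest(k, pool, key=h)     # keep the k best overall
    return min(beam, key=h)


# Hypothetical toy problem: find 42 among the integers, h = distance to 42.
found = local_beam_search(
    3,
    lambda rng: rng.randrange(101),
    lambda x: [x - 1, x + 1],
    lambda x: abs(x - 42),
)
print(found)  # 42
```

Because the k best are chosen from the pooled successors, a single promising state can supply the entire next beam, which is what distinguishes this from k independent searches.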
Local Beam Search (contd)
• Not the same as k random-start searches run in parallel!
• Searches that find good states recruit other searches to
join them
• Problem: quite often, all k states end up on same local hill
• Idea: Stochastic beam search
– Choose k successors randomly, biased towards good ones
• Observe the close analogy to natural selection!
Hey! Perhaps
sex can improve
search?
Sure! Check out
ye book.
Genetic algorithms
• Twist on Local Search: successor is generated by combining two parent states
• A state is represented as a string over a finite alphabet (e.g. binary)
– 8-queens
• State = position of 8 queens each in a column
• Start with k randomly generated states (population)
• Evaluation function (fitness function):
– Higher values for better states.
– Opposite to heuristic function, e.g., # non-attacking pairs in 8-queens
• Produce the next generation of states by “simulated evolution”
– Random selection
– Crossover
– Random mutation
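A compact sketch of this loop for 8-queens is given below. Population size, mutation rate, generation limit, and single-point crossover are assumptions chosen for the example, and states use 0-indexed rows (one row per column).

```python
import random

N = 8
MAX_FITNESS = N * (N - 1) // 2  # 28 non-attacking pairs


def fitness(state):
    """Number of non-attacking pairs; state[c] is the row of the queen in column c."""
    attacking = sum(
        1
        for c1 in range(N)
        for c2 in range(c1 + 1, N)
        if state[c1] == state[c2] or abs(state[c1] - state[c2]) == c2 - c1
    )
    return MAX_FITNESS - attacking


def genetic_algorithm(k=20, generations=1000, mutation_rate=0.1, seed=0):
    """Return the fittest state found; it is a solution only if fitness is 28."""
    rng = random.Random(seed)
    population = [tuple(rng.randrange(N) for _ in range(N)) for _ in range(k)]
    for _ in range(generations):
        best = max(population, key=fitness)
        if fitness(best) == MAX_FITNESS:
            return best
        weights = [fitness(s) for s in population]
        next_gen = []
        for _ in range(k):
            # Random selection, with probability proportional to fitness.
            mom, dad = rng.choices(population, weights=weights, k=2)
            cut = rng.randrange(1, N)              # random crossover point
            child = list(mom[:cut] + dad[cut:])
            if rng.random() < mutation_rate:       # random mutation of one column
                child[rng.randrange(N)] = rng.randrange(N)
            next_gen.append(tuple(child))
        population = next_gen
    return max(population, key=fitness)


best = genetic_algorithm(generations=50)
print(best, fitness(best))
```

Whether a run reaches fitness 28 depends on the seed and the parameters, so the result should be checked with `fitness` rather than assumed to be a solution.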
[Figure: 8-queens board with rows numbered 1 to 8]
String representation: 16257483
Can we evolve 8-queens through genetic algorithms?
Evolving 8-queens
?
Sorry!
Wrong queens
Genetic algorithms
• Fitness function: number of non-attacking pairs of queens (min = 0, max = 8 × 7/2 = 28)
• 24/(24+23+20+11) = 31%
• 23/(24+23+20+11) = 29% etc
[Figure: one generation of the genetic algorithm on 4 states of the 8-queens problem. 2 pairs of 2 states are randomly selected based on fitness, random crossover points are selected, new states are formed by crossover, and random mutation is applied.]
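The selection percentages quoted above follow from dividing each fitness by the total. A short check, using the four fitness values from the slide (variable names are illustrative):

```python
import random

fitnesses = [24, 23, 20, 11]
total = sum(fitnesses)                       # 78
probs = [f / total for f in fitnesses]
percentages = [round(100 * p) for p in probs]
print(percentages)  # [31, 29, 26, 14]

# Fitness-proportionate selection of 2 pairs of parents (4 draws, by index).
rng = random.Random(0)
parents = rng.choices(range(len(fitnesses)), weights=fitnesses, k=4)
```

`random.choices` samples with replacement, so the same state can be picked more than once and a low-fitness state may not be picked at all.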
Genetic algorithms
Has the effect of “jumping” to a completely different new
part of the search space (quite non-local)
Comments on Genetic Algorithms
• Genetic algorithm is a variant of “stochastic beam search”
• Positive points
– Random exploration can find solutions that local search can’t
• (via crossover primarily)
– Appealing connection to human evolution
• “neural” networks, and “genetic” algorithms are metaphors!
• Negative points
– Large number of “tunable” parameters
• Difficult to replicate performance from one problem to another
– Lack of good empirical studies comparing to simpler methods
– Useful on some (small?) set of problems but no convincing evidence
that GAs are better than hill-climbing w/random restarts in general
Optimization of Continuous Functions
• Discretization
– use hill-climbing
• Gradient descent
– make a move in the direction of the negative gradient (downhill, when minimizing)
• gradients: closed form or empirical
Gradient Descent
Assume we have a continuous function f(x1, x2, …, xN)
that we want to minimize over the continuous variables x1, x2, …, xN
1. Compute the gradients for all i: $\partial f(x_1, x_2, \dots, x_N) / \partial x_i$
2. Take a small step downhill, against the direction of the gradient:
$x_i \leftarrow x_i - \lambda \, \partial f(x_1, x_2, \dots, x_N) / \partial x_i$
3. Repeat.
• How to select λ
– Line search: successively double λ
– until f starts to increase again
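The three steps and the doubling line search can be sketched as below. The empirical (central-difference) gradient, the starting value λ = 0.001, the stopping tolerance, and the quadratic test function are assumptions for illustration; the sketch also assumes the starting λ is small enough that the first step goes downhill.

```python
def numerical_gradient(f, x, eps=1e-6):
    """Empirical gradient by central differences, one coordinate at a time."""
    grad = []
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += eps
        down[i] -= eps
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad


def step(x, grad, lam):
    """x_i <- x_i - lam * df/dx_i for every coordinate."""
    return [xi - lam * gi for xi, gi in zip(x, grad)]


def line_search(f, x, grad, lam=1e-3, max_doublings=40):
    """Successively double lam until f starts to increase again."""
    best = f(step(x, grad, lam))
    for _ in range(max_doublings):
        trial = f(step(x, grad, 2 * lam))
        if trial >= best:
            break
        lam, best = 2 * lam, trial
    return lam


def gradient_descent(f, x0, iters=200, tol=1e-12):
    x = list(x0)
    for _ in range(iters):
        grad = numerical_gradient(f, x)
        if sum(g * g for g in grad) < tol:    # gradient is essentially zero
            break
        x = step(x, grad, line_search(f, x, grad))
    return x


# Hypothetical test function with its minimum at (1, -2).
f = lambda x: (x[0] - 1) ** 2 + 2 * (x[1] + 2) ** 2
x_min = gradient_descent(f, [0.0, 0.0])
print([round(v, 3) for v in x_min])
```

If a closed-form gradient is available it can replace `numerical_gradient`, which saves 2N function evaluations per iteration.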
