Swapna.C
Motivation
 Genetic algorithms provide a learning method motivated
by an analogy to biological evolution.
 GAs are motivated by a number of factors:
 Evolution is known to be a successful, robust method for
adaptation within biological systems.
 Genetic algorithms can search spaces of hypotheses
containing complex interacting parts, where the impact of
each part on overall hypothesis fitness may be difficult to
model.
 Genetic algorithms are easily parallelized and can take
advantage of the decreasing costs of powerful computer
hardware.
 Genetic algorithms and genetic programming, in which
entire computer programs are evolved, are two of the more
popular approaches in a field that is sometimes called
evolutionary computation.
GENETIC ALGORITHMS
 In GAs the "best hypothesis" is defined as the one that
optimizes a predefined numerical measure for the problem
at hand, called the hypothesis Fitness.
 The algorithm operates by iteratively updating a pool of
hypotheses, called the population. On each iteration, all
members of the population are evaluated according to the
fitness function. A new population is then generated by
probabilistically selecting the most fit individuals from the
current population.
 Some of these selected individuals are carried forward into
the next generation population intact. Others are used as
the basis for creating new offspring individuals by
applying genetic operations such as crossover and
mutation.
 In the algorithm, each iteration through the main loop
produces a new generation of hypotheses based on the
current population.
 1. A certain number of hypotheses from the current
population are selected for inclusion in the next
generation.
 These are selected probabilistically, where the
probability of selecting hypothesis hi is given by
Pr(hi) = Fitness(hi) / Σj Fitness(hj), where the sum runs over all p members of the current population.
 The probability that a hypothesis will be selected is thus
proportional to its own fitness and inversely
proportional to the fitness of the other competing
hypotheses in the current population.
 Once these members of the current generation have
been selected for inclusion in the next generation
population, additional members are generated using a
crossover operation.
 This GA algorithm thus performs a randomized,
parallel beam search for hypotheses that perform well
according to the fitness function.
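 A minimal Python sketch of one such iteration, assuming bit-string hypotheses, a nonnegative fitness function, and supplied crossover and mutate operators (sketched later, under Genetic Operators); the replacement fraction r = 0.6 is an illustrative value:
```python
import random

def fitness_proportionate_select(population, fitness, k=1):
    """Draw k hypotheses; each draw picks hi with probability
    Fitness(hi) / sum_j Fitness(hj) over the current population."""
    weights = [fitness(h) for h in population]
    return random.choices(population, weights=weights, k=k)

def next_generation(population, fitness, crossover, mutate, r=0.6):
    """One main-loop iteration: carry some members forward, add crossover offspring,
    then apply the mutation operator (which should change hypotheses only rarely)."""
    p = len(population)
    # Probabilistically select (1 - r) * p members to survive intact.
    new_pop = fitness_proportionate_select(population, fitness, k=round((1 - r) * p))
    # Create r * p offspring by crossover of probabilistically chosen parent pairs.
    for _ in range(round(r * p / 2)):
        parent_a, parent_b = fitness_proportionate_select(population, fitness, k=2)
        new_pop.extend(crossover(parent_a, parent_b))
    return [mutate(h) for h in new_pop]
```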
 Representing Hypotheses :
 Hypotheses in GAs are often represented by bit
strings, so that they can be easily manipulated by
genetic operators such as mutation and crossover.
 The hypotheses represented by these bit strings can be
quite complex.
 Ex: sets of if-then rules can easily be represented in
this way, by choosing an encoding of rules that
allocates specific substrings for each rule precondition
and postcondition.
 To pick an example, consider the attribute Outlook, which
can take on any of the three values Sunny, Overcast, or
Rain. One obvious way to represent a constraint on
Outlook is to use a bit string of length three, in which
each bit position corresponds to one of its three possible
values. Placing a 1 in some position indicates that the
attribute is allowed to take on the corresponding value.
 For example, the string 010 represents the constraint that
Outlook must take on the second of these values, or
Outlook = Overcast. Similarly, the string 011
represents the more general constraint that allows two
possible values, or (Outlook = Overcast v Rain).
 Note that 111 represents the most general possible constraint,
indicating that we don't care which of its possible values
the attribute takes on.
 Given this method for representing constraints on a
single attribute, conjunctions of constraints on
multiple attributes can easily be represented by
concatenating the corresponding bit strings.
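 The following Python sketch illustrates this encoding, using the Outlook attribute from the text plus a hypothetical Wind attribute (Strong, Weak) added purely for illustration:
```python
# Value orderings assumed for this sketch; Wind is an illustrative extra attribute.
ATTRIBUTES = {
    "Outlook": ["Sunny", "Overcast", "Rain"],
    "Wind": ["Strong", "Weak"],
}

def encode_constraint(attribute, allowed_values):
    """One bit per possible value: 1 means the value is allowed, 0 means it is not."""
    return ''.join('1' if v in allowed_values else '0' for v in ATTRIBUTES[attribute])

def encode_preconditions(constraints):
    """Concatenate per-attribute substrings; unconstrained attributes get all 1s."""
    return ''.join(
        encode_constraint(a, constraints.get(a, ATTRIBUTES[a]))
        for a in ATTRIBUTES
    )

# (Outlook = Overcast v Rain) AND (Wind = Strong)  ->  "011" + "10"
print(encode_preconditions({"Outlook": {"Overcast", "Rain"}, "Wind": {"Strong"}}))  # 01110
```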
 Some syntactically legal bit strings represent hypotheses we
may not wish to consider (for example, a precondition
substring of all 0s, which no attribute value can satisfy). If
we wish to avoid considering such a hypothesis, we may
employ a different encoding, alter the genetic operators so
that they explicitly avoid constructing such bit strings, or
simply assign a very low fitness to such bit strings.
 In some GAs, hypotheses are represented by symbolic
descriptions rather than bit strings.
2.2 Genetic Operators
 The generation of successors in a GA is determined by a set
of operators that recombine and mutate selected members
of the current population.
 These operators correspond to idealized versions of the
genetic operations found in biological evolution. The two
most common operators are crossover and mutation.
 The crossover operator produces two new offspring
from two parent strings, by copying selected bits from
each parent. The bit at position i in each offspring is copied
from the bit at position i in one of the two parents. The
choice of which parent contributes the bit for position i is
determined by an additional string called the crossover
mask.
 Consider the topmost of the two offspring in this case. This
offspring takes its first five bits from the first parent and its
remaining six bits from the second parent, because the
crossover mask 11111000000 specifies these choices for
each of the bit positions. The second offspring uses the
same crossover mask, but switches the roles of the two
parents. Therefore, it contains the bits that were not used
by the first offspring.
 In single-point crossover, the crossover mask is always
constructed so that it begins with a string containing n
contiguous 1’s, followed by the necessary number of 0’s to
complete the string. This results in offspring in which the
first n bits are contributed by one parent and the remaining
bits by the second parent.
 Each time the single-point crossover operator is
applied, the crossover point n is chosen at random,
and the crossover mask is then created and
applied.
 In two-point crossover, offspring are created by
substituting intermediate segments of one parent
into the middle of the second parent string. Put
another way, the crossover mask is a string beginning
with n0 zeros, followed by a contiguous string of n1
ones, followed by the necessary number of zeros
to complete the string. Each time the two-point
crossover operator is applied, a mask is generated by
randomly choosing the integers n0 and n1.
 Uniform crossover combines bits sampled uniformly
from the two parents. The crossover mask is generated
as a random bit string with each bit chosen at random
and independent of the others.
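 As a sketch, the three mask constructions and the mask-driven recombination might look as follows (equal-length bit-string parents assumed):
```python
import random

def apply_mask(parent_a, parent_b, mask):
    """Offspring 1 takes bit i from parent_a where mask[i] == '1' and from parent_b
    otherwise; offspring 2 uses the same mask with the parents' roles switched."""
    child1 = ''.join(a if m == '1' else b for a, b, m in zip(parent_a, parent_b, mask))
    child2 = ''.join(b if m == '1' else a for a, b, m in zip(parent_a, parent_b, mask))
    return child1, child2

def single_point_mask(length):
    """n contiguous 1s followed by the necessary number of 0s; n chosen at random."""
    n = random.randrange(1, length)
    return '1' * n + '0' * (length - n)

def two_point_mask(length):
    """n0 zeros, then n1 contiguous 1s, then zeros to complete the string."""
    n0 = random.randrange(0, length)
    n1 = random.randrange(1, length - n0 + 1)
    return '0' * n0 + '1' * n1 + '0' * (length - n0 - n1)

def uniform_mask(length):
    """Each mask bit chosen at random, independently of the others."""
    return ''.join(random.choice('01') for _ in range(length))

# Example: single-point crossover of two 11-bit parents with mask 11111000000.
print(apply_mask("11101001000", "00001010101", "11111000000"))
# -> ('11101010101', '00001001000')
```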
 In addition to recombination operators that produce
offspring by combining parts of two parents, a second
type of operator produces offspring from a single
parent.
 In particular, the mutation operator produces small
random changes to the bit string by choosing a
single bit at random, then changing its value.
Mutation is often performed after crossover has been
applied.
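 A minimal sketch of this mutation operator (in practice it is typically applied to only a small, randomly chosen fraction of the population, as the mutation-rate setting later in these notes suggests):
```python
import random

def point_mutate(hypothesis):
    """Choose a single bit position at random and flip its value."""
    i = random.randrange(len(hypothesis))
    flipped = '1' if hypothesis[i] == '0' else '0'
    return hypothesis[:i] + flipped + hypothesis[i + 1:]
```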
2.3 Fitness Function and Selection
 The fitness function defines the criterion for ranking
potential hypotheses and for probabilistically selecting
them for inclusion in the next generation population.
 If the task is to learn classification rules, then the
fitness function typically has a component that scores
the classification accuracy of the rule over a set of
provided training examples. Often other criteria may
be included as well, such as the complexity or
generality of the rule.
 The probability that a hypothesis will be selected is given by the
ratio of its fitness to the fitness of other members of the current
population. This method is sometimes called fitness
proportionate selection, or roulette wheel selection.
 In tournament selection, two hypotheses are first chosen
at random from the current population. With some
predefined probability p the more fit of these two is then
selected, and with probability (1 - p) the less fit hypothesis is
selected.
 Tournament selection often yields a more diverse population
than fitness proportionate selection (Goldberg and Deb 1991).
 In another method called rank selection, the hypotheses in
the current population are first sorted by fitness. The
probability that a hypothesis will be selected is then
proportional to its rank in this sorted list, rather than its fitness.
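 Sketches of these two alternatives; the tournament probability p = 0.75 is an illustrative value, and rank selection is shown with the least-fit hypothesis given rank 1:
```python
import random

def tournament_select(population, fitness, p=0.75):
    """Pick two hypotheses at random; with probability p return the fitter one,
    otherwise the less fit one."""
    a, b = random.sample(population, 2)
    winner, loser = (a, b) if fitness(a) >= fitness(b) else (b, a)
    return winner if random.random() < p else loser

def rank_select(population, fitness):
    """Selection probability proportional to rank in the fitness-sorted population,
    rather than to raw fitness."""
    ranked = sorted(population, key=fitness)          # least fit first
    ranks = range(1, len(ranked) + 1)
    return random.choices(ranked, weights=ranks, k=1)[0]
```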
3. AN ILLUSTRATIVE EXAMPLE
 A genetic algorithm can be viewed as a general
optimization method that searches a large space of
candidate objects seeking one that performs best according
to the fitness function. Although not guaranteed to find an
optimal object, GAs often succeed in finding an object
with high fitness.
 GAs have been applied to a number of optimization
problems outside machine learning, including problems
such as circuit layout and job-shop scheduling. Within
machine learning, they have been applied both to
function-approximation problems and to tasks such as
choosing the network topology for artificial neural network
learning systems.
 To illustrate the use of GAs for concept learning, we briefly
summarize the GABIL system described by DeJong et al. (1993).
 GABIL uses a GA to learn boolean concepts represented by
a disjunctive set of propositional rules.
 In experiments over several concept learning problems,
GABIL was found to be roughly comparable in
generalization accuracy to other learning algorithms such
as the decision tree learning algorithm C4.5 and the rule
learning system AQ14.
 The learning tasks in this study included both artificial
learning tasks designed to explore the systems'
generalization accuracy and the real world problem of
breast cancer diagnosis.
 In experiments reported by DeJong et al. (1993), the
parameter r, which determines the fraction of the
parent population replaced by crossover, was set to 0.6.
The parameter m, which determines the mutation
rate, was set to 0.001.
 These are typical settings for these parameters. The
population size p was varied from 100 to 1000,
depending on the specific learning task.
 The specific instantiation of the GA algorithm in
GABIL can be summarized as follows:
 Representation. Each hypothesis in GABIL
corresponds to a disjunctive set of propositional rules,
encoded as a bit string.
 The hypothesis space of rule preconditions consists of
a conjunction of constraints on a fixed set of attributes.
 To represent a set of rules, the bit-string
representations of individual rules are concatenated.
 The rule postcondition is described by a single bit that
indicates the predicted value of the target attribute c.
 Note the length of the bit string grows with the
number of rules in the hypothesis.
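 A sketch of this variable-length encoding, assuming (purely for illustration) two attributes a1 and a2 with two possible values each, so that each rule occupies five bits:
```python
# Illustrative attribute widths; the actual GABIL attribute set depends on the task.
ATTR_SIZES = {"a1": 2, "a2": 2}
RULE_LEN = sum(ATTR_SIZES.values()) + 1   # precondition bits + 1 postcondition bit

def encode_rule(a1_bits, a2_bits, target_bit):
    """One rule: per-attribute constraint substrings followed by the predicted class bit."""
    return a1_bits + a2_bits + target_bit

def encode_hypothesis(rules):
    """A hypothesis is a disjunctive rule set: concatenate the rules' bit strings,
    so the string length grows with the number of rules."""
    return ''.join(rules)

h = encode_hypothesis([encode_rule("10", "01", "1"), encode_rule("11", "10", "0")])
print(h, len(h) // RULE_LEN, "rules")     # 1001111100 2 rules
```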
 Genetic operators: GABIL uses the standard
mutation operator in which a single bit is chosen at
random and replaced by its complement.
 The crossover operator that it uses is a fairly standard
extension of the two-point crossover operator. To perform
a crossover operation on two parents, two crossover points
are first chosen at random in the first parent string. Let
d1 (d2) denote the distance from the leftmost (rightmost)
of these two crossover points to the rule boundary
immediately to its left.
 For example, if the two parent strings are
h1 : 10 01 1 11 10 0
h2 : 01 11 0 10 01 0
and the crossover points chosen for the first parent are the
points following bit positions 1 and 8,
h1 : 1[0 01 1 11 1]0 0
 where "[" and "]" indicate crossover points, then d1 = 1
and d2 = 3. Hence the allowed pairs of crossover points
for the second parent include the pairs of bit positions
(1,3), (1,8), and (6,8). If the pair (1,3) happens to be
chosen,
h2 : 0[1 1]1 0 10 01 0
 then the two resulting offspring will be
h3 : 11 10 0
h4 : 00 01 1 11 11 0 10 01 0
 As this example illustrates, this crossover operation enables
offspring to contain a different number of rules than their
parents, while ensuring that all bit strings generated in this
fashion represent well-defined rule sets.
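 A Python sketch of this boundary-respecting crossover; the five-bit rule length matches the illustrative representation sketch above, and the code assumes a compatible pair of cut points exists in the second parent:
```python
import random

def gabil_crossover(h1, h2, rule_len=5):
    """Two-point crossover that only pairs cut points whose distances (d1, d2) to the
    rule boundary on their left match, so both offspring remain whole rule sets."""
    # Choose two crossover points in the first parent.
    i1, i2 = sorted(random.sample(range(1, len(h1)), 2))
    d1, d2 = i1 % rule_len, i2 % rule_len
    # Enumerate cut-point pairs in the second parent with the same (d1, d2).
    lefts  = [j for j in range(1, len(h2)) if j % rule_len == d1]
    rights = [j for j in range(1, len(h2)) if j % rule_len == d2]
    candidates = [(j1, j2) for j1 in lefts for j2 in rights if j1 <= j2]
    j1, j2 = random.choice(candidates)
    # Swap the middle segments.
    child1 = h1[:i1] + h2[j1:j2] + h1[i2:]
    child2 = h2[:j1] + h1[i1:i2] + h2[j2:]
    return child1, child2
```
 With the parents above and cut points (1, 8) in h1 and (1, 3) in h2, this produces exactly the one-rule and three-rule offspring shown in the example.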
 Fitness function. The fitness of each hypothesized rule
set is based on its classification accuracy over the training
data. In particular, the function used to measure fitness is
Fitness(h) = (correct(h))^2
 where correct(h) is the percent of all training
examples correctly classified by hypothesis h.
 In experiments comparing the behavior of GABIL to
decision tree learning algorithms such as C4.5 and ID5R,
and to the rule learning algorithm AQ14, DeJong et al.
(1993) report roughly comparable performance among
these systems, tested on a variety of learning problems.
 For example, over a set of 12 synthetic problems, GABIL
achieved an average generalization accuracy of 92.1 %,
whereas the performance of the other systems ranged from
91.2 % to 96.6 %.
 3.1 Extensions:
 DeJong et al. (1993) also explore two interesting
extensions to the basic design of GABIL. In one set of
experiments they explored the addition of two new
genetic operators that were motivated by the
generalization operators common in many symbolic
learning methods.
 The first of these operators, AddAlternative,
generalizes the constraint on a specific attribute by
changing a 0 to a 1 in the substring corresponding to
the attribute.
 DropCondition performs a more drastic
generalization step, by replacing all bits for a particular
attribute by a 1. This operator corresponds to
generalizing the rule by completely dropping the
constraint on the attribute, and was applied on each
generation with probability .60.
 In the above experiment, the two new operators were
applied with the same probability to each hypothesis
in the population on each generation.
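 A sketch of these two operators acting on a single rule's bit string, assuming the illustrative two-attribute layout used earlier:
```python
import random

# Illustrative layout: a1 occupies bits 0-1 and a2 bits 2-3 of each rule,
# followed by one postcondition bit (matching the representation sketch above).
ATTR_SPANS = {"a1": (0, 2), "a2": (2, 4)}

def add_alternative(rule, attr):
    """Generalize the constraint on attr by changing one 0 to a 1 in its substring."""
    start, end = ATTR_SPANS[attr]
    zeros = [i for i in range(start, end) if rule[i] == '0']
    if not zeros:
        return rule                       # constraint is already fully general
    i = random.choice(zeros)
    return rule[:i] + '1' + rule[i + 1:]

def drop_condition(rule, attr):
    """Generalize more drastically: replace all of attr's bits by 1,
    i.e. drop the constraint on that attribute entirely."""
    start, end = ATTR_SPANS[attr]
    return rule[:start] + '1' * (end - start) + rule[end:]

print(add_alternative("10011", "a2"))     # "10111": the single 0 in a2's substring flipped
print(drop_condition("10011", "a1"))      # "11011": a1's constraint dropped
```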
 In a second experiment, the bit-string representation
for hypotheses was extended to include two bits that
determine which of these operators may be applied to
the hypothesis.
 In this extended representation, the bit string for a
typical rule set hypothesis carries two additional control
bits at its end.
 These final two bits can indicate, for example, that the
AddAlternative operator may be applied to the bit
string but that the DropCondition operator may
not. The two new bits define part of the search
strategy used by the GA and are themselves altered
and evolved using the same crossover and mutation
operators that operate on other bits in the string.
4 HYPOTHESIS SPACE SEARCH
 The GA search can move much more abruptly, replacing a
parent hypothesis by an offspring that may be radically
different from the parent. Note the GA search is therefore
less likely to fall into the same kind of local minima that
can plague gradient descent methods.
 One practical difficulty in some GA applications is the
problem of crowding.
 Crowding is a phenomenon in which some individual that
is more highly fit than others in the population quickly
reproduces, so that copies of this individual and very
similar individuals take over a large fraction of the
population.
 The negative impact of crowding is that it reduces the diversity
of the population, thereby slowing further progress by the GA.
 Several strategies have been explored for reducing crowding.
One approach is to alter the selection function, using criteria
such as tournament selection or rank selection in place of fitness
proportionate roulette wheel selection. A related strategy is
"fitness sharing," in which the measured fitness of an individual
is reduced by the presence of other, similar individuals in the
population.
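 A sketch of fitness sharing, assuming Hamming distance as the similarity measure and a triangular sharing kernel with a fixed radius (both illustrative choices; the text does not specify them):
```python
def hamming(a, b):
    """Number of positions at which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))

def shared_fitness(population, raw_fitness, radius=3):
    """Divide each individual's raw fitness by a niche count that grows with the
    number of similar individuals (within 'radius' bit flips) in the population."""
    shared = []
    for h in population:
        niche = sum(max(0.0, 1 - hamming(h, other) / radius) for other in population)
        shared.append(raw_fitness(h) / niche)    # niche >= 1 since h counts itself
    return shared
```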
 A third approach is to restrict the kinds of individuals allowed
to recombine to form offspring.
 For example, by allowing only the most similar individuals to
recombine, we can encourage the formation of clusters of similar
individuals, or multiple "subspecies" within the population. A
related approach is to spatially distribute individuals and allow
only nearby individuals to recombine.
 Many of these techniques are inspired by the analogy to
biological evolution.
4.1 Population Evolution and the
Schema Theorem
 A schema is any string composed of 0s, 1s, and *'s. Each
schema represents the set of bit strings containing the
indicated 0s and 1s, with each "*" interpreted as a "don't
care."
 For example, the schema 0*10 represents the set of bit
strings that includes exactly 0010 and 0110.
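 A one-function sketch of schema matching:
```python
def matches(schema, bit_string):
    """True if bit_string is an instance of schema ('*' matches either bit value)."""
    return len(schema) == len(bit_string) and all(
        s == '*' or s == b for s, b in zip(schema, bit_string)
    )

print([b for b in ("0010", "0110", "0011", "1010") if matches("0*10", b)])  # ['0010', '0110']
```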
 The schema theorem characterizes the evolution of the
population within a GA in terms of the number of
instances representing each schema. Let m(s, t) denote the
number of instances of schema s in the population at
time t (i.e., during the t th generation). The schema
theorem describes the expected value of m(s, t + 1) in
terms of m(s, t) and other properties of the schema,
population, and GA algorithm parameters.
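 As usually stated (following Holland), the theorem gives a lower bound of the form below, where û(s, t) is the average fitness of the instances of schema s at generation t, f̄(t) the average fitness of the population, d(s) the defining length of s, o(s) its order, l the bit-string length, p_c the crossover rate, and p_m the per-bit mutation rate; these symbol definitions are supplied here for reference rather than taken from the slides:
$$
E[m(s, t+1)] \;\ge\; \frac{\hat{u}(s,t)}{\bar{f}(t)}\, m(s,t)\left(1 - p_c \frac{d(s)}{l-1}\right)\left(1 - p_m\right)^{o(s)}
$$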