Genetic algorithm with dynamic population for solving the simultaneous optimization of multiple query orders
Mahyar Teymournezhad
m_teimoornezhad@yahoo.com
Abstract
The purpose of multiple-query optimization (MQO) is to find execution plans that minimize the total cost of executing a set of queries. Each query can individually have several alternative plans, and each plan consists of a series of tasks, so the goal of MQO is to choose for every query the plan that shares the most tasks with the plans chosen for the other queries. In the general case this is an NP-complete problem, and various methods have been proposed for it. In this paper, the multiple-query optimization problem is solved with a genetic algorithm that uses a dynamic population. The results show that the proposed method has a lower execution time and a higher convergence speed than existing methods.
Keywords: Multi-query optimization, Genetic Algorithm, Link Sequence
I. INTRODUCTION
Query optimization is one of the most important and costly parts of a database system. When several queries are submitted to the database at the same time, it is the task of the database optimizer to obtain an execution plan that minimizes the total cost of executing them. MQO is usually formulated in two phases: in the first phase a set of alternative execution plans is produced for each query, and in the second phase one plan is selected for every query so that the total cost of the chosen plans is minimal. Following the formulation in [4], the second phase is studied here independently of the first phase; it is usually the most time-consuming phase of the problem.
To solve several queries simultaneously and identify their shared tasks, the complete set of execution plans for these queries has to be considered together, because a plan with costly tasks may still lead to more sharing with other queries and thus to a better overall solution of the MQO problem. In Section 3, the modeling of the MQO problem with the genetic algorithm is investigated.
II. A review of the genetic algorithm
The genetic algorithm is one of the best-known heuristic techniques for optimizing complex problems. It was introduced in [8] and has been used to solve many NP-complete problems. Goldberg [4] demonstrated its practicality by surveying a range of its applications. The genetic algorithm simulates the evolutionary concepts of biology; this simulation uses probabilistic methods based on evolutionary principles [1]. Its basic data structure is a vector of genes called a chromosome. Each chromosome represents one candidate solution to the problem, and its members (the genes) are the parts of that solution. The quality of a candidate solution (i.e., a chromosome) is defined by how close it is to the optimal solution and is measured by the fitness function.
The genetic algorithm searches for an optimal solution using evolutionary operators (also called genetic operators). Initially, chromosomes are generated at random to represent a variety of solutions. Genetic operators are then applied to these chromosomes to produce the chromosomes of the next generation.
• The three operators used in the genetic algorithm are as follows:
• Crossover operator: a part of a parent's chromosome is exchanged to build the child's chromosome.
• Mutation operator: new chromosomes are produced by randomly modifying a small number of genes of a chromosome. The mutation operator is never applied to the best solution in the population.
• Selection operator: this operator determines which chromosomes survive into the next generation.
• The simplest policy is to give better chromosomes a greater chance of surviving into the next generation, which is what actually happens in evolutionary processes. However, applying a crossover or mutation operator to an unfit chromosome may sometimes produce a fit one [3].
The most commonly used selection techniques are:
• Truncation (cutting) method: all chromosomes are first sorted in descending order (from best to worst) by their fitness value. The top n chromosomes of this list are then passed to the next generation with equal probability.
• Tournament (race) method: a number r is chosen at random; then r chromosomes are drawn from the population and the one with the best value (lowest runtime) is passed to the next generation. This is repeated until enough chromosomes have been obtained for the next generation. In this method a chromosome may be selected several times.
• Roulette-wheel (fortune wheel) method: the chromosomes are placed on slices of a circle according to their fitness; the better a chromosome's fitness, the larger its slice. A random number is then generated and the chromosome whose slice contains that number is passed to the next generation. Steinbrunn [9] has used this method (see the sketch after this list).
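As an illustration, the following is a minimal C sketch of the roulette-wheel method described above. The function name and the array-based interface are assumptions for this sketch, not taken from the paper.

#include <stdlib.h>

/* Roulette-wheel selection: each chromosome occupies a slice of the wheel
 * proportional to its fitness, and a random point on the wheel picks the
 * survivor.  `fitness` holds one non-negative value per chromosome. */
int roulette_select(const double *fitness, int pop_size)
{
    double total = 0.0;
    for (int i = 0; i < pop_size; i++)
        total += fitness[i];

    /* random point in [0, total) */
    double r = ((double)rand() / ((double)RAND_MAX + 1.0)) * total;

    double cumulative = 0.0;
    for (int i = 0; i < pop_size; i++) {
        cumulative += fitness[i];
        if (r < cumulative)
            return i;            /* index of the selected chromosome */
    }
    return pop_size - 1;         /* fallback for rounding at the boundary */
}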
III. Modeling of Genetic Algorithm for MQO
• The genetic algorithm can model the MQO problem in a natural way. Each chromosome is one solution to the problem, and each gene of a chromosome represents the plan chosen for the corresponding query (a minimal C sketch of this encoding is given after this list).
• Each chromosome C_i is composed of a number of genes G_j, and each gene is a solution (a plan) for one query. Within a generation, the number of chromosomes varies with the population size P_k of that generation.
• Selection operator (Σ): takes the population of a generation and selects some of its chromosomes to be transferred to the next generation.
• Mutation operator (M): takes a chromosome as input and creates a new chromosome.
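A minimal sketch of this encoding in C; the struct and field names, and the fixed array bound, are illustrative choices rather than the paper's.

#define MAX_QUERIES 64            /* illustrative bound, not from the paper */

/* One gene per query: the index of the plan chosen for that query. */
typedef struct {
    int num_queries;              /* Q                                           */
    int plan[MAX_QUERIES];        /* plan[j] = plan chosen for query j (gene G_j) */
} Chromosome;                     /* one chromosome C_i = one MQO solution       */

/* A generation: its size P_k may differ from one generation to the next. */
typedef struct {
    Chromosome *members;
    int         size;             /* P_k */
} Population;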
III. Modeling of Genetic Algorithm for MQO
To select chromosomes for the next generation, the quality of each chromosome, as determined by the fitness function, is considered. A simple choice for the fitness function is the reciprocal of the total execution time of the tasks of the plans selected by the chromosome.
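A sketch of such a fitness function in C, under the assumption that a task shared by several of the selected plans is executed, and therefore counted, only once; the Instance layout and array bounds are hypothetical.

#define MAX_TASKS    1024         /* illustrative bounds, not from the paper */
#define MAX_QUERIES  64
#define MAX_PLANS    16
#define MAX_PLAN_LEN 32

/* Hypothetical description of one MQO instance. */
typedef struct {
    int    num_queries;                                      /* Q                     */
    double task_time[MAX_TASKS];                             /* runtime of each task  */
    int    plans_per_query[MAX_QUERIES];                     /* plans of query q      */
    int    plan_len[MAX_QUERIES][MAX_PLANS];                 /* tasks in plan p of q  */
    int    plan_tasks[MAX_QUERIES][MAX_PLANS][MAX_PLAN_LEN]; /* their task ids        */
} Instance;

/* Total runtime of one solution: a task shared by several chosen plans is
 * executed, and therefore paid for, only once.  Fitness is the reciprocal,
 * so cheaper solutions receive higher fitness. */
double fitness(const Instance *in, const int *chosen_plan)
{
    char   used[MAX_TASKS] = {0};
    double total = 0.0;
    for (int q = 0; q < in->num_queries; q++) {
        int p = chosen_plan[q];
        for (int k = 0; k < in->plan_len[q][p]; k++) {
            int t = in->plan_tasks[q][p][k];
            if (!used[t]) {                  /* count each shared task once */
                used[t] = 1;
                total += in->task_time[t];
            }
        }
    }
    return 1.0 / total;
}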
Crossover and mutation operators can also be defined easily for the MQO problem, and they always create new, valid solutions. Since each gene of a chromosome represents the plan selected for its query, replacing that plan with another plan of the same query creates a new valid solution; this is exactly what the mutation operator does.
For the crossover operator, different variants can be considered: single-point, multi-point, and segment crossover. In the proposed method, all of these produce valid solutions: if two chromosomes represent two valid solutions of the MQO problem, any crossover applied to them creates new valid solutions. Regardless of the type of crossover and of where it is applied, the pieces that are exchanged represent valid plans for their corresponding queries, so the new solutions are also valid.
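A minimal C sketch of these two operators for the encoding above; the single-point crossover variant is shown, and the names and bounds are illustrative.

#include <stdlib.h>

#define MAX_QUERIES 64

typedef struct {
    int num_queries;
    int plan[MAX_QUERIES];        /* plan[j] = plan chosen for query j */
} Chromosome;

/* Mutation: pick one query at random and replace its plan by another of
 * that query's own plans, so the result is always a valid MQO solution. */
void mutate(Chromosome *c, const int *plans_per_query)
{
    int q = rand() % c->num_queries;
    c->plan[q] = rand() % plans_per_query[q];
}

/* Single-point crossover: the child takes the genes before the cut point
 * from one parent and the rest from the other.  Since every gene is a
 * valid plan for its own query, the child is again a valid solution. */
Chromosome crossover(const Chromosome *a, const Chromosome *b)
{
    Chromosome child = *a;
    int cut = (a->num_queries > 1) ? 1 + rand() % (a->num_queries - 1) : 0;
    for (int q = cut; q < a->num_queries; q++)
        child.plan[q] = b->plan[q];
    return child;
}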
IV. The proposed solution: genetic algorithm with dynamic population size (DP-GA)
This paper presents a method in which the size of the population varies from generation to generation. In the classical methods, the population size is kept constant. This keeps the algorithm simple, but it introduces an artificial limitation and does not follow the genetics of natural populations, whose size changes continuously. Known drawbacks of heuristic methods are that the algorithm can stop at a local minimum and that it is computationally expensive; in addition, the crowding phenomenon degrades the quality of the genetic algorithm [3].
In the proposed method, when enough resources are available and fairly good solutions exist, the population size is increased; when the number of good solutions in a generation is low, the population size is decreased. At first glance this may not seem very effective, but the experiments show that it improves both the accuracy and the execution speed of the genetic algorithm.
Although in some generations the algorithm may increase the population, and thereby increase the amount of computation and slow that generation down, choosing an appropriate threshold for how the population changes reduces the overall run time. One of the most important parameters of the method is therefore how the population is resized. To do this, the problem is first solved with a greedy method, chosen because of its fast execution; the value it produces is used as a threshold. In the next stage, the population of the next generation is determined from the difference between the best answer of each generation and the answer given by the greedy algorithm.
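The paper does not describe its greedy procedure in detail, so the following C sketch shows one plausible greedy baseline: for each query in turn it commits the plan with the smallest additional cost given the tasks already chosen. It reuses the illustrative Instance layout from the fitness sketch in Section III.

#include <float.h>

/* One plausible greedy baseline (an assumption, not the paper's exact
 * procedure): for each query in turn, choose the plan whose *additional*
 * cost is smallest given the tasks already paid for by earlier queries.
 * Instance and MAX_TASKS are the illustrative types from the fitness
 * sketch in Section III. */
double greedy_time(const Instance *in, int *chosen_plan)
{
    char   used[MAX_TASKS] = {0};
    double total = 0.0;

    for (int q = 0; q < in->num_queries; q++) {
        int    best_p = 0;
        double best_extra = DBL_MAX;

        for (int p = 0; p < in->plans_per_query[q]; p++) {
            double extra = 0.0;
            for (int k = 0; k < in->plan_len[q][p]; k++)
                if (!used[in->plan_tasks[q][p][k]])
                    extra += in->task_time[in->plan_tasks[q][p][k]];
            if (extra < best_extra) { best_extra = extra; best_p = p; }
        }

        chosen_plan[q] = best_p;                 /* commit the cheapest plan */
        for (int k = 0; k < in->plan_len[q][best_p]; k++)
            used[in->plan_tasks[q][best_p][k]] = 1;
        total += best_extra;
    }
    return total;    /* used as the greedy threshold T_greedy */
}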
In this paper, the population of the new generation is computed from the greedy threshold and the best solution of the current generation. In the formula, P_{i+1} denotes the population size of the next generation, P_i the population size of the current generation, T_greedy the execution time computed by the greedy algorithm, and T_best the execution time of the best solution in the current generation.
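One way to realize this update is sketched below in C, assuming a ratio-based rule P_{i+1} = P_i * (T_greedy / T_best): the population grows when the generation's best time beats the greedy threshold and shrinks otherwise, which matches the behaviour described above. This rule is an assumption; the paper's exact expression may differ.

/* Dynamic population resizing (DP-GA), assuming the ratio-based update
 * P_{i+1} = P_i * (T_greedy / T_best), clamped to a sane range.  It grows
 * the population when the best chromosome of the current generation beats
 * the greedy threshold and shrinks it otherwise. */
int next_population_size(int current_size,      /* P_i                        */
                         double t_greedy,       /* greedy-algorithm runtime   */
                         double t_best,         /* best runtime in generation */
                         int min_size, int max_size)
{
    int next = (int)(current_size * (t_greedy / t_best) + 0.5);
    if (next < min_size) next = min_size;
    if (next > max_size) next = max_size;
    return next;                                /* P_{i+1} */
}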
V. The experimental results and analysis
This section presents the experimental results comparing the genetic algorithm with a constant population and the proposed genetic algorithm with a dynamic population. The experiments are performed on a computer with a 2.2 GHz processor and 2 GB of main memory, and the algorithms are implemented in C.
• The input is generated using the parameters of Table 1, as follows:
• First, the tasks are generated randomly. Two parameters are used: the number of tasks (T) and the range of task execution times. T tasks are created, and each task is assigned a runtime drawn from the interval [MinET, MaxET]. The tasks are then distributed among the plans using the parameters MinP and MaxP. Although many plans may share tasks, no two exactly identical plans are created. Finally, the queries are created using the number of query orders (Q) and the parameters MinQ and MaxQ (a sketch of this generator follows the list).
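A C sketch of such an instance generator, filling the hypothetical Instance structure used in the earlier sketches; the helper rand_between and the uniform distributions are assumptions, and the check against producing two identical plans mentioned above is omitted for brevity.

#include <stdlib.h>

static int rand_between(int lo, int hi) { return lo + rand() % (hi - lo + 1); }

/* Random instance generator following the parameters described above:
 * T tasks with runtimes in [MinET, MaxET], MinP..MaxP tasks per plan,
 * and MinQ..MaxQ plans per query.  Instance is the illustrative layout
 * from the fitness sketch in Section III. */
void generate_instance(Instance *in, int T, double MinET, double MaxET,
                       int MinP, int MaxP, int Q, int MinQ, int MaxQ)
{
    in->num_queries = Q;

    /* 1. T tasks, each with a runtime drawn uniformly from [MinET, MaxET] */
    for (int t = 0; t < T; t++)
        in->task_time[t] = MinET + (MaxET - MinET) * (rand() / (double)RAND_MAX);

    /* 2.+3. every query gets MinQ..MaxQ plans of its own; each plan draws
     * MinP..MaxP tasks from the common task pool, so tasks (but not whole
     * plans) can be shared between queries */
    for (int q = 0; q < Q; q++) {
        in->plans_per_query[q] = rand_between(MinQ, MaxQ);
        for (int p = 0; p < in->plans_per_query[q]; p++) {
            in->plan_len[q][p] = rand_between(MinP, MaxP);
            for (int k = 0; k < in->plan_len[q][p]; k++)
                in->plan_tasks[q][p][k] = rand() % T;  /* tasks may be shared */
        }
    }
}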
• Each query has its own set of candidate plans, so a given plan does not belong to more than one query. At the same time, since each query is answered by one plan and each plan consists of tasks, the tasks themselves can be shared between queries.
TABLE I: The values used to simulate the MQO problem with the genetic algorithm
Parameter name                              Value
Initial population size                     100
Number of generations (iterations)          10
Mutation rate                               1%
To calculate the input size, the averages of (MinP, MaxP) and (MinQ, MaxQ) are used; the input size corresponds to the size of the search space of the MQO problem. This number is not correlated with the value of the optimal solution: although a query can have several executable plans, only one of them appears for that query in the final answer. The value of the optimal solution therefore depends only on the number of queries, the number of tasks in each selected plan, and the execution times of those tasks.
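For illustration, the size of the search space can be computed as the product of the per-query plan counts, since every complete solution picks exactly one plan per query; the short C sketch below does this over the hypothetical Instance structure from Section III. The paper's own averaged formula is not reproduced here.

/* Number of distinct complete solutions: each query contributes a factor
 * equal to its number of alternative plans. */
double search_space_size(const Instance *in)
{
    double size = 1.0;
    for (int q = 0; q < in->num_queries; q++)
        size *= (double)in->plans_per_query[q];
    return size;
}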
Heuristic algorithms for NP-complete problems are usually compared on two aspects: the execution time of the algorithm and the distance of the solution it finds from the optimal solution. Bayir [1] examines nine different configurations of the genetic algorithm for the multiple-query optimization problem, including the truncation (cutting) selection operator. The initial population size applies only to the first generation of the algorithm; in later generations this value becomes smaller or larger depending on the best answer of each generation. The mutation rate specifies what percentage of the chromosomes of each generation are mutated. In the following, the phrase "execution time of a chromosome" denotes the execution time of the tasks of the MQO input that the chromosome selects.
The results are shown in Figures 1 and 2. The DP-GA algorithm finds the optimal answer with higher precision; the reason is the variable population, which allows the algorithm to escape local minima and search a larger part of the space. Figure 2 also shows that the speed of the proposed algorithm increases. When the execution times of the chromosomes of a generation differ significantly from the optimal solution, the population size decreases; this decrease speeds up the algorithm and, as noted in Section 4, increases the convergence rate in these cases.
Figure 1. Response time chart
Figure 2. Runtime graph
VI. CONCLUSION
In this paper, a solution based on a genetic algorithm with a variable population was presented. Two of the main drawbacks of the genetic algorithm are its computation time and its tendency to get trapped in local extrema. With the proposed method, although the amount of computation in a single generation may sometimes increase, the total amount of computation is reduced and the algorithm therefore becomes faster. In addition, because the population grows under the conditions described in the paper, the proposed algorithm can escape local minima and converges faster than the basic genetic algorithm. The disadvantage of the method is that the initial parameters and an appropriate threshold for the rate of population change have to be set. In this paper the greedy method was used to obtain the threshold value; randomized methods such as random walk and iterative improvement may give more accurate estimates of this value.
REFERENCES
[1] M. A. Bayir, I. H. Toroslu, and A. Cosar, "Genetic algorithm for the multiple-query optimization problem," IEEE Trans. Syst., Man, Cybern. C, Appl. Rev., vol. 37, no. 1, Jan. 2007.
[2] G. Moerkotte, Building Query Compilers, pp. 375-385, 2006.
[3] T. M. Mitchell, Machine Learning, pp. 250-270, 1997.
[4] D. E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning. Reading, MA: Addison-Wesley, 1989.
[5] T. Sellis, "Multiple query optimization," ACM Trans. Database Syst., vol. 13, no. 1, pp. 23-52, 1988.
[6] A. Cosar, E. P. Lim, and J. Srivastava, "Multiple query optimization with depth-first branch-and-bound and dynamic query ordering," in Proc. CIKM '93, 1993, pp. 433-438.
[7] K. Shim, T. Sellis, and D. Nau, "Improvements on a heuristic algorithm for multiple-query optimization," Data Knowl. Eng., vol. 12, no. 2, pp. 197-222, 1994.
[8] J. H. Holland, Adaptation in Natural and Artificial Systems. Ann Arbor, MI: Univ. Michigan Press, 1975.
[9] M. Steinbrunn, G. Moerkotte, and A. Kemper, "Heuristic and randomized optimization for the join ordering problem," The VLDB Journal, vol. 6, pp. 191-208, 1997.
