Dominance-Based Pareto-Surrogate for Multi-Objective Optimization
Ilya Loshchilov¹,², Marc Schoenauer¹,², Michèle Sebag²,¹
¹ TAO Project-team, INRIA Saclay - Île-de-France
² Laboratoire de Recherche en Informatique (UMR CNRS 8623), Université Paris-Sud, 91128 Orsay Cedex, France
Simulated Evolution And Learning (SEAL-2010)
Ilya Loshchilov, Marc Schoenauer, Michèle Sebag Dominance-Based Pareto-Surrogate for Multi-Objective Optimization 1/ 1
Multi-objective CMA-ES (MO-CMA-ES)
MO-CMA-ES runs µ_mo independent (1+1)-CMA-ES instances.
Each (1+1)-CMA-ES samples one new offspring, so the temporary population has size 2µ_mo.
The µ_mo best solutions are then selected for the new population by hypervolume-based non-dominated sorting.
Finally, the CMA parameters of each individual are updated.
[Figure: objective space (Objective 1 vs. Objective 2) showing dominated points and the Pareto front.]
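The generation loop above can be sketched in a few lines of Python. This is a minimal illustration, not the actual algorithm: isotropic Gaussian mutation stands in for the full (1+1)-CMA sampling step, and the hypervolume tie-break inside a front as well as the CMA parameter update are omitted.

```python
import numpy as np

def dominates(fa, fb):
    """Pareto dominance for minimization: fa dominates fb."""
    return all(a <= b for a, b in zip(fa, fb)) and any(a < b for a, b in zip(fa, fb))

def nondominated_sort(objs):
    """Split objective vectors into successive non-dominated fronts."""
    remaining = list(range(len(objs)))
    fronts = []
    while remaining:
        front = [i for i in remaining
                 if not any(dominates(objs[j], objs[i]) for j in remaining if j != i)]
        fronts.append(front)
        remaining = [i for i in remaining if i not in front]
    return fronts

def generation(parents, f, mu, sigma=0.1, rng=None):
    """One simplified MO-CMA-ES generation: each of the mu parents samples
    one child (Gaussian mutation as a stand-in for the CMA step), and the
    best mu of the 2*mu pool survive via non-dominated sorting."""
    rng = rng or np.random.default_rng(0)
    children = [p + sigma * rng.standard_normal(len(p)) for p in parents]
    pool = list(parents) + children
    objs = [f(x) for x in pool]
    order = [i for front in nondominated_sort(objs) for i in front]
    return [pool[i] for i in order[:mu]]
```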
Global Surrogate Model
Goal: find a function F(x) that defines the aggregated quality of a solution x in the multi-objective case.
Idea: use F(x) for optimization or for filtering, to find promising new solutions.
An efficient SVM-based approach was recently proposed.¹
[Figure: the aggregated surrogate F_SVM in objective space (Objective 1 vs. Objective 2) and in decision space (x1, x2): dominated, Pareto, and new Pareto points; the current front lies in a band of width 2ε between the levels p − ε and p + ε.]
¹ I. Loshchilov, M. Schoenauer, M. Sebag: "A Mono Surrogate for Multiobjective Optimization", GECCO 2010.
SVM-informed EMOA: Filtering
Generate N_inform pre-children.
For each pre-child A, with nearest parent B, compute Gain(A, B) = F_svm(A) − F_svm(B).
The new child is the pre-child with the maximum value of Gain.
[Figure: decision space (x1, x2) with the true Pareto set and the SVM-approximated Pareto set.]
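The filtering step can be sketched as follows. `sample_child` and `f_svm` are placeholders for the variation operator and the learned surrogate, which are not specified at this level of detail.

```python
import numpy as np

def filter_offspring(parents, sample_child, f_svm, n_inform=10):
    """Surrogate-based filtering (sketch): among n_inform pre-children,
    keep the one with the largest surrogate gain over its nearest parent."""
    pre_children = [sample_child() for _ in range(n_inform)]

    def gain(child):
        nearest = min(parents, key=lambda p: np.linalg.norm(child - p))
        return f_svm(child) - f_svm(nearest)

    return max(pre_children, key=gain)
```

Only the selected pre-child is evaluated on the true objectives, which is where the savings in function evaluations come from.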
Support Vector Machine for Classification
Linear Classifier
[Figure: linear classifier: candidate hyperplanes L1, L2, L3; separating hyperplane ⟨w, x⟩ = b with margin 2/‖w‖ and support vectors on the levels b ± 1.]
Main Idea
Training data: D = {(x_i, y_i) | x_i ∈ R^p, y_i ∈ {−1, +1}}, i = 1, …, n.
⟨w, x_i⟩ ≥ b + ε ⇒ y_i = +1;
⟨w, x_i⟩ ≤ b − ε ⇒ y_i = −1.
Dividing by ε > 0:
⟨w, x_i⟩ − b ≥ +1 ⇒ y_i = +1;
⟨w, x_i⟩ − b ≤ −1 ⇒ y_i = −1.
Optimization Problem: Primal Form
Minimize over (w, ξ):  (1/2)‖w‖² + C Σ_{i=1}^n ξ_i
subject to: y_i(⟨w, x_i⟩ − b) ≥ 1 − ξ_i,  ξ_i ≥ 0.
Support Vector Machine for Classification
Linear Classifier
[Figure: same linear-classifier illustration as on the previous slide.]
Optimization Problem: Dual Form
By the Lagrangian formulation, instead of minimizing F alone, one minimizes F − Σ_i α_i G_i over the primal variables, subject to α_i ≥ 0 for the constraints G_i ≥ 0.
Omitting the details, the dual form is:
Maximize over α:  Σ_{i=1}^n α_i − (1/2) Σ_{i,j=1}^n α_i α_j y_i y_j ⟨x_i, x_j⟩
subject to: 0 ≤ α_i ≤ C,  Σ_{i=1}^n α_i y_i = 0.
Properties
Decision function: F(x) = sign(Σ_{i=1}^n α_i y_i ⟨x_i, x⟩ − b).
The dual form can be solved with a standard quadratic programming solver.
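To make the dual concrete, here is a toy sketch (not the solver used in the paper) that maximizes the dual by projected gradient ascent on a linearly separable dataset; the box constraint is handled by clipping and the equality constraint by projecting the gradient.

```python
import numpy as np

def svm_dual_train(X, y, C=10.0, lr=0.01, steps=2000):
    """Maximize the dual  sum(a_i) - 1/2 sum a_i a_j y_i y_j <x_i, x_j>
    by projected gradient ascent (a sketch; any QP solver would do).
    0 <= a_i <= C is enforced by clipping, sum(a_i y_i) = 0 by
    projecting the gradient onto the constraint hyperplane."""
    n = len(y)
    K = X @ X.T
    a = np.zeros(n)
    for _ in range(steps):
        grad = 1.0 - y * (K @ (a * y))      # dW/da_i = 1 - y_i sum_j a_j y_j K_ij
        grad -= y * (grad @ y) / (y @ y)    # keep sum(a_i y_i) = 0
        a = np.clip(a + lr * grad, 0.0, C)
    w = (a * y) @ X                          # recover the primal weights
    sv = (a > 1e-6) & (a < C - 1e-6)         # unbounded support vectors
    b = float(np.mean(X[sv] @ w - y[sv])) if sv.any() else 0.0
    return w, b
```

On a separable toy set the recovered hyperplane classifies the training points exactly; with kernels, `K = X @ X.T` would simply be replaced by the kernel matrix.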
Support Vector Machine for Classification
Non-Linear Classifier
[Figure (a–c): non-linear classification: Φ maps the data to a feature space where the margin boundaries ⟨w, Φ(x)⟩ − b = ±1 are linear, with margin 2/‖w‖ and support vectors.]
Non-linear classification with the "kernel trick":
Maximize over α:  Σ_{i=1}^n α_i − (1/2) Σ_{i,j=1}^n α_i α_j y_i y_j K(x_i, x_j)
subject to: α_i ≥ 0,  Σ_{i=1}^n α_i y_i = 0,
where K(x, x′) := ⟨Φ(x), Φ(x′)⟩ is the kernel function.
Decision function: F(x) = sign(Σ_{i=1}^n α_i y_i K(x_i, x) − b).
Support Vector Machine for Classification
Non-Linear Classifier: Kernels
Polynomial: k(x_i, x_j) = (⟨x_i, x_j⟩ + 1)^d
Gaussian (Radial Basis Function): k(x_i, x_j) = exp(−‖x_i − x_j‖² / 2σ²)
Hyperbolic tangent: k(x_i, x_j) = tanh(κ⟨x_i, x_j⟩ + c)
Examples for the polynomial (left) and Gaussian (right) kernels: [figures omitted]
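The three kernels are one-liners in Python; note the sign convention in the Gaussian kernel (the exponent is negative), and that the tanh kernel is only a valid (positive semi-definite) kernel for some parameter settings.

```python
import numpy as np

def poly_kernel(xi, xj, d=3):
    """Polynomial kernel (<xi, xj> + 1)^d."""
    return (np.dot(xi, xj) + 1.0) ** d

def rbf_kernel(xi, xj, sigma=1.0):
    """Gaussian (RBF) kernel exp(-||xi - xj||^2 / (2 sigma^2))."""
    diff = np.asarray(xi, dtype=float) - np.asarray(xj, dtype=float)
    return np.exp(-np.dot(diff, diff) / (2.0 * sigma ** 2))

def tanh_kernel(xi, xj, kappa=1.0, c=0.0):
    """Hyperbolic tangent kernel; not PSD for all (kappa, c), used heuristically."""
    return np.tanh(kappa * np.dot(xi, xj) + c)
```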
Ranking Support Vector Machine
Find F(x) that preserves the ordering of the training points.
[Figure: training points x projected onto w, with rank level sets L(r1), L(r2).]
Ranking Support Vector Machine
A simplified formulation with a linear number of constraints (one per point), where 1 rank = 1 point.
Primal problem
Minimize over (w, ξ):  (1/2)‖w‖² + Σ_{i=1}^{N−1} C_i ξ_i
subject to:
⟨w, Φ(x_i) − Φ(x_{i+1})⟩ ≥ 1 − ξ_i  (i = 1 … N − 1)
ξ_i ≥ 0  (i = 1 … N − 1)
Dual problem
Maximize over α:  Σ_{i=1}^{N−1} α_i − (1/2) Σ_{i,j=1}^{N−1} α_i α_j K(x_i − x_{i+1}, x_j − x_{j+1})
subject to: 0 ≤ α_i ≤ C_i  (i = 1 … N − 1)
Rank Surrogate Function
F(x) = Σ_{i=1}^{N−1} α_i (K(x_i, x) − K(x_{i+1}, x))
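Given the dual coefficients α_i, evaluating the rank surrogate is a direct transcription of the formula above. In this sketch, `kernel` is a hypothetical callable and the α_i are assumed to come from a dual solver.

```python
def rank_surrogate(alphas, X_sorted, kernel):
    """Build F(x) = sum_i alpha_i (K(x_i, x) - K(x_{i+1}, x)) from a
    training set X_sorted ranked best-first (sketch)."""
    def F(x):
        return sum(a * (kernel(X_sorted[i], x) - kernel(X_sorted[i + 1], x))
                   for i, a in enumerate(alphas))
    return F
```

With a linear kernel on ranked 1-D points, F reproduces the ranking: higher-ranked inputs get strictly larger surrogate values.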
Dominance-Based Surrogate
Rank Support Vector Machine
Goal: find a function F(x) such that x_i ≻ x_j ⇒ F(x_i) > F(x_j), where "≻" denotes the Pareto-dominance relation.
Such an F(x) is invariant under any "≻"-preserving transformation of the objective functions.
The hypervolume indicator, in contrast, is not invariant, at least in its current formulation.
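This invariance can be checked mechanically: applying a strictly increasing transformation to each objective leaves every dominance relation, and hence every constraint on F, unchanged. A small sketch:

```python
import math

def dominates(fa, fb):
    """Pareto dominance (minimization): fa dominates fb."""
    return all(a <= b for a, b in zip(fa, fb)) and any(a < b for a, b in zip(fa, fb))

# A strictly increasing transform of each objective preserves the relation.
transform = lambda f: [math.exp(f[0]), f[1] ** 3]
a, b = [1.0, 4.0], [2.0, 5.0]
assert dominates(a, b) == dominates(transform(a), transform(b))
```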
Dominance-Based Surrogate
Model complexity: how to choose the constraints?
Learning all possible ≻ relations may be too expensive.
Learning only the Primary constraints to build a basic model is a reasonable choice.
Additionally learning a small number of the most violated Secondary constraints makes the model smoother.
[Figure: objective space sketch of F_SVM over points a–f, with Primary and Secondary "≻" constraints.]
Dominance-Based Surrogate
Primary and Secondary constraints
Primary dominance constraints are associated with pairs (x_i, x_j) such that x_j is the nearest neighbor of x_i in objective space, conditionally on x_i dominating x_j.
Secondary dominance constraints are associated with pairs (x_i, x_j) such that x_i belongs to the current Pareto front and x_j belongs to another non-dominated front.
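One possible reading of these definitions, as a sketch: primary pairs link each point to the nearest point it dominates; secondary pairs here link Pareto points to all points of later fronts (not already used as a primary pair).

```python
import numpy as np

def dominates(fa, fb):
    return all(a <= b for a, b in zip(fa, fb)) and any(a < b for a, b in zip(fa, fb))

def build_constraints(objs):
    """Sketch: primary pairs (i, j) with j the nearest (objective-space)
    point dominated by i; secondary pairs link the Pareto front to the
    remaining (dominated) points."""
    n = len(objs)
    pareto = [i for i in range(n)
              if not any(dominates(objs[j], objs[i]) for j in range(n) if j != i)]
    primary = []
    for i in range(n):
        dominated = [j for j in range(n) if dominates(objs[i], objs[j])]
        if dominated:
            jj = min(dominated,
                     key=lambda j: np.linalg.norm(np.array(objs[i]) - np.array(objs[j])))
            primary.append((i, jj))
    secondary = [(i, j) for i in pareto for j in range(n)
                 if j not in pareto and (i, j) not in primary]
    return primary, secondary
```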
Construction of the surrogate model
Initialize the archive Ω_active as the set of Primary constraints, and Ω_passive as the set of Secondary constraints.
Optimize the model for 1000 |Ω_active| iterations.
Add the most violated passive constraint from Ω_passive to Ω_active and optimize the model for 10 |Ω_active| iterations.
Repeat the last step 0.1 |Ω_active| times.
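The schedule above can be sketched as a small driver loop, with `optimize` and `violation` as placeholders for the actual Rank-SVM update and the margin-violation measure.

```python
def train_active_set(active, passive, optimize, violation):
    """Active-set schedule from the slide: train on the primary constraints,
    then repeatedly promote the most violated secondary constraint.
    optimize(constraints, n_iter) and violation(constraint) are stand-ins."""
    active = list(active)
    passive = list(passive)
    optimize(active, 1000 * len(active))
    for _ in range(max(1, int(0.1 * len(active)))):
        if not passive:
            break
        worst = max(passive, key=violation)   # most violated passive constraint
        passive.remove(worst)
        active.append(worst)
        optimize(active, 10 * len(active))
    return active
```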
Experimental Validation
Parameters
Surrogate models:
ASM: aggregated surrogate model based on One-Class SVM and Regression SVM²
RASM: the proposed Rank-based SVM
SVM learning:
Number of training points: at most N_training = 1000
Number of iterations: 1000 |Ω_active| + |Ω_active|² ≈ 2 N_training²
Kernel function: RBF, with σ equal to the average distance between the training points
Cost of constraint violation: C = 1000
Offspring selection procedure:
Number of pre-children: p = 2 and p = 10
² I. Loshchilov, M. Schoenauer, M. Sebag: "A Mono Surrogate for Multiobjective Optimization", GECCO 2010.
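One plausible reading of "σ equal to the average distance of the training points" is the mean pairwise Euclidean distance, which can be computed as follows (an assumption, sketched for at least two points):

```python
import numpy as np

def rbf_sigma(X):
    """RBF width as the mean pairwise Euclidean distance of the training
    points (one reading of the setup; assumes len(X) >= 2)."""
    X = np.asarray(X, dtype=float)
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    n = len(X)
    return d.sum() / (n * (n - 1))  # mean over the off-diagonal pairs
```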
Experimental Validation
Results
Table 1. Comparative results of two baseline EMOAs, S-NSGA-II and MO-CMA-ES, and their ASM and RASM variants. Median number of function evaluations (out of 10 independent runs) to reach the ∆H_target values, normalized by Best: a value of 1 indicates the best result; a value X > 1 indicates that the corresponding algorithm needed X times more evaluations than the best to reach the same precision.
                     ZDT1                          ZDT2
∆Htarget             1    0.1  0.01 1e-3 1e-4   |  1    0.1  0.01 1e-3 1e-4
Best                 1100 3000 5300 7800 38800  |  1400 4200 6600 8500 32700
S-NSGA-II            1.6  2    2    2.3  1.1    |  1.8  1.7  1.8  2.3  1.2
ASM-NSGA p=2         1.2  1.5  1.4  1.5  1.5    |  1.2  1.2  1.2  1.4  1
ASM-NSGA p=10        1    1    1    1    .      |  1    1    1    1    .
RASM-NSGA p=2        1.2  1.4  1.4  1.6  1      |  1.3  1.2  1.2  1.5  1
RASM-NSGA p=10       1    1.1  1.1  1.5  .      |  1.1  1    1    1.2  .
MO-CMA-ES            16.5 14.4 12.3 11.3 .      |  14.7 10.7 10   10.1 .
ASM-MO-CMA p=2       6.8  8.5  8.3  8    .      |  5.9  8.2  7.7  7.5  .
ASM-MO-CMA p=10      6.9  10.1 10.4 12.1 .      |  5    .    .    .    .
RASM-MO-CMA p=2      5.1  7.7  7.6  7.4  .      |  5.2  .    .    .    .
RASM-MO-CMA p=10     3.6  4.3  4.9  7.2  .      |  3.2  .    .    .    .

                     IHR1                          IHR2
∆Htarget             1    0.1  0.01  1e-3  1e-4 |  1    0.1  0.01  1e-3  1e-4
Best                 500  2000 35300 41200 50300|  1700 7000 12900 52900 .
S-NSGA-II            1.6  1.5  .     .     .    |  1.1  3.2  6.2   .     .
ASM-NSGA p=2         1.2  1.3  .     .     .    |  1    3.9  4.9   .     .
ASM-NSGA p=10        1    1.5  .     .     .    |  1.4  6.4  4.6   .     .
RASM-NSGA p=2        1.2  1.2  .     .     .    |  1.5  .    .     .     .
RASM-NSGA p=10       1    1    .     .     .    |  1.2  5.1  4.8   .     .
MO-CMA-ES            8.2  6.5  1.1   1.2   1.2  |  5.8  2.7  2.1   1     .
ASM-MO-CMA p=2       4.6  2.9  1     1     1    |  3.1  1.6  1.4   1.1   .
ASM-MO-CMA p=10      9.2  6.1  1.3   1.2   .    |  5.9  2.6  2.4   .     .
RASM-MO-CMA p=2      2.6  2.3  2.4   2.1   .    |  2.2  1    1     .     .
RASM-MO-CMA p=10     1.8  1.9  .     .     .    |  .    .    .     .     .

(A "." entry indicates that the corresponding target was not reached.)
Experimental Validation
Comparing the original and SVM-informed versions of NSGA-II and MO-CMA-ES on the ZDT and IHR problems shows:
The SVM-informed versions are about 1.5 times faster for p = 2 and 2-5 times faster for p = 10, up to the point where the algorithm finds nearly-optimal Pareto points.
Premature convergence of the approximation of the optimal µ-distribution is observed, because the global surrogate model addresses only convergence, not diversity.
Summary
The proposed aggregated surrogate model is invariant under ≻-preserving transformations of the objective functions.
The speed-up is significant, but limited to the convergence phase toward the optimal Pareto front.
The model can incorporate "any" kind of preference: extreme points, "=" relations, hypervolume contribution, decision-maker-defined ≻ relations.
[Figure: objective space sketch of F_SVM over points a–g, combining Primary and Secondary "≻" constraints with Primary and Secondary "=" constraints.]
Thank you for your attention!
Questions?
