CVRP solver with Multi-Head Attention
Rintaro Sato
Kyushu University, Japan
What do we want?
Input:
・each customer location, demand
・vehicle capacity
・depot location
Output:
・tour with minimum cost (=distance)
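The input and output listed above can be sketched as a small data structure plus a cost function. This is only an illustration: the names `CVRPInstance` and `tour_cost` and the toy coordinates are assumptions, not taken from the slides, and the cost is assumed to be Euclidean distance with every route starting and ending at the depot.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class CVRPInstance:
    """Hypothetical container for the inputs listed on the slide."""
    depot: Tuple[float, float]            # depot location (x, y)
    customers: List[Tuple[float, float]]  # customer locations (x, y)
    demands: List[float]                  # demand of each customer
    capacity: float                       # vehicle capacity


def tour_cost(inst: CVRPInstance, routes: List[List[int]]) -> float:
    """Total Euclidean length of all routes (depot -> customers -> depot)."""
    total = 0.0
    for route in routes:
        load = sum(inst.demands[c] for c in route)
        if load > inst.capacity:
            raise ValueError("route exceeds vehicle capacity")
        points = [inst.depot] + [inst.customers[c] for c in route] + [inst.depot]
        total += sum(math.dist(a, b) for a, b in zip(points, points[1:]))
    return total


# Toy instance (made-up numbers): three customers, capacity for two of them.
inst = CVRPInstance(
    depot=(0.0, 0.0),
    customers=[(3.0, 0.0), (3.0, 4.0), (0.0, 4.0)],
    demands=[1.0, 1.0, 1.0],
    capacity=2.0,
)
cost = tour_cost(inst, [[0, 1], [2]])  # two routes: 3 + 4 + 5 and 4 + 4
```

The solver's job is then to find the set of routes that makes this cost as small as possible.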
Training Overview
Embedding: input → $h_i$
Encoder: MHA layer, $h_i$ → $h_i^{(N)}$
Decoder: generate context vector $h_C^{(N)}$ → MHA layer → select next node
Evaluate Cost & Update Encoder, Decoder
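Initial Embedding in Encoder
Input feature: vector $x_i = (x, y, \mathrm{demand})_i \in \mathbb{R}^3$
Embedding vector: $h_i = W x_i + b \in \mathbb{R}^{d_h}$ (e.g. $d_h = 128$)
Each input feature (e.g. $x_1$, $x_7$) is mapped to an embedding vector ($h_1$, $h_7$) in the $d_h$-dimensional vector space.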
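A minimal sketch of the mapping $h_i = W x_i + b$ in plain Python follows. The random initialisation range, the helper names `init_linear` and `embed`, and the feature values are assumptions for illustration; in practice $W$ and $b$ are learned parameters.

```python
import random


def init_linear(d_in, d_out, seed=0):
    """Create a (d_out x d_in) weight matrix and a zero bias (illustrative init)."""
    rng = random.Random(seed)
    W = [[rng.uniform(-0.1, 0.1) for _ in range(d_in)] for _ in range(d_out)]
    b = [0.0] * d_out
    return W, b


def embed(x, W, b):
    """Compute h = W x + b for one node feature x = (x, y, demand)."""
    return [sum(w * v for w, v in zip(row, x)) + bias for row, bias in zip(W, b)]


d_h = 128                                       # embedding size from the slide
W, b = init_linear(3, d_h)
features = [(0.2, 0.7, 0.1), (0.9, 0.4, 0.3)]   # made-up (x, y, demand) rows
embeddings = [embed(x, W, b) for x in features]  # one d_h-vector per node
```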
Multi-Head Attention layer (= MHA layer) [2]
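As a rough illustration of what an MHA layer computes, here is a plain-Python sketch of scaled dot-product multi-head self-attention in the style of [2]. It covers only the attention step and omits any residual connections, normalization and feed-forward sublayers a full layer may contain. The parameter layout (`params["heads"]`, `params["Wo"]`), the helper names and the sizes are assumptions, not taken from the slides.

```python
import math
import random


def matmul(A, B):
    """Multiply matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def multi_head_attention(H, params, n_heads):
    """Self-attention over node embeddings H (n x d_h); returns an n x d_h matrix."""
    d_h = len(H[0])
    if d_h % n_heads != 0:
        raise ValueError("d_h must be divisible by n_heads")
    d_k = d_h // n_heads
    heads = []
    for m in range(n_heads):
        Wq, Wk, Wv = params["heads"][m]                     # each d_h x d_k
        Q, K, V = matmul(H, Wq), matmul(H, Wk), matmul(H, Wv)
        out = []
        for q in Q:
            # Compatibility of this query with every key, scaled by sqrt(d_k).
            scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in K]
            weights = softmax(scores)
            # Weighted sum of the value vectors.
            out.append([sum(w * v[j] for w, v in zip(weights, V)) for j in range(d_k)])
        heads.append(out)
    # Concatenate the heads per node, then apply the output projection.
    concat = [sum((head[i] for head in heads), []) for i in range(len(H))]
    return matmul(concat, params["Wo"])


rng = random.Random(0)
n, d_h, n_heads = 4, 8, 2
H = rand_matrix(n, d_h, rng)
params = {
    "heads": [tuple(rand_matrix(d_h, d_h // n_heads, rng) for _ in range(3))
              for _ in range(n_heads)],
    "Wo": rand_matrix(d_h, d_h, rng),
}
out = multi_head_attention(H, params, n_heads)
```

The output has the same shape as the input, so layers of this kind can be stacked.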
How Decoder works
Encoder → $h^{(N)}$ → Decoder: generate context vector $h_C^{(N)}$ → select next node
Context vector $h_C^{(N)}$: contains state information
$h_C^{(N)} = [h^{(N)}, h_{\pi_{t-1}}^{(N)}, D_t]$
・$h^{(N)}$: graph embedding (= output of Encoder)
・$h_{\pi_{t-1}}^{(N)}$: last visited node embedding
・$D_t$: remaining vehicle capacity
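The concatenation that builds the context vector can be sketched as follows. One assumption is made loudly here: the graph embedding is taken to be the mean of the node embeddings, as in [1]; the slide only says it is the output of the Encoder. The function names and toy numbers are illustrative.

```python
def graph_embedding(node_embeddings):
    """Assumption: the graph embedding is the mean of all node embeddings."""
    n = len(node_embeddings)
    return [sum(col) / n for col in zip(*node_embeddings)]


def context_vector(node_embeddings, last_idx, remaining_capacity):
    """Concatenate [graph embedding, last visited node embedding, D_t]."""
    h_graph = graph_embedding(node_embeddings)
    h_last = node_embeddings[last_idx]
    return h_graph + h_last + [remaining_capacity]


# Toy encoder output with d_h = 2 for three nodes (made-up numbers).
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
h_c = context_vector(H, last_idx=2, remaining_capacity=0.5)
```

With embeddings of size $d_h$, the resulting vector has length $2 d_h + 1$.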
The context vector $h_C^{(N)}$ is passed through the Decoder's MHA layer, which outputs the probability $p_\theta$ used to select the next node.
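Turning decoder scores into a probability over nodes and picking the next node might look like the sketch below. The masking of nodes that must not be chosen (already visited, or demand above the remaining capacity) and the choice between sampling and greedy selection are assumptions about how such a decoder is typically used; the logits are made-up numbers.

```python
import math
import random


def masked_softmax(logits, mask):
    """Softmax over logits; mask[i] is True for nodes that must not be selected."""
    valid = [x for x, m in zip(logits, mask) if not m]
    if not valid:
        raise ValueError("every node is masked")
    mx = max(valid)
    exps = [0.0 if m else math.exp(x - mx) for x, m in zip(logits, mask)]
    total = sum(exps)
    return [e / total for e in exps]


def select_next_node(probs, rng=None, greedy=False):
    """Pick the most probable node (greedy) or sample one according to probs."""
    if greedy:
        return max(range(len(probs)), key=probs.__getitem__)
    rng = rng or random
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


logits = [2.0, 1.0, 0.5, 1.0]       # hypothetical decoder scores per node
mask = [False, True, False, False]  # node 1 is not allowed at this step
probs = masked_softmax(logits, mask)
node = select_next_node(probs, greedy=True)
```

Sampling is the usual choice while training with a policy gradient, since it lets the model explore; greedy selection is a common choice at evaluation time.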
Training Overview (Decoder loop)
After the Encoder, the Decoder runs in a while loop until all routes are completed:
generate context vector $h_C^{(N)}$ → MHA layer → select next node → update $D_t$, $h_{\pi_{t-1}}^{(N)}$
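The while loop can be sketched as below. The learned decoder is replaced by a placeholder `score_fn` (an assumption standing in for $p_\theta$), and the rule of returning to the depot when no unvisited customer fits the remaining capacity is a simplification chosen for this sketch.

```python
import random


def decode_routes(demands, capacity, score_fn, rng):
    """Build routes step by step; demands[0] is the depot (demand 0).

    score_fn(last, candidate, remaining) returns a positive weight and stands
    in for the probability a trained decoder would assign.
    """
    n = len(demands)
    visited = [False] * n
    visited[0] = True                      # the depot is not a customer
    routes, route = [], []
    remaining, last = capacity, 0          # D_t and the last visited node
    while not all(visited):
        feasible = [i for i in range(1, n)
                    if not visited[i] and demands[i] <= remaining]
        if not feasible:
            if not route:
                raise ValueError("a customer demand exceeds the vehicle capacity")
            routes.append(route)           # return to the depot, start a new route
            route, remaining, last = [], capacity, 0
            continue
        weights = [score_fn(last, i, remaining) for i in feasible]
        nxt = rng.choices(feasible, weights=weights, k=1)[0]
        visited[nxt] = True
        route.append(nxt)
        remaining -= demands[nxt]          # update D_t
        last = nxt                         # update the last visited node
    if route:
        routes.append(route)
    return routes


demands = [0, 3, 4, 2, 5, 1]               # made-up integer demands, depot first
capacity = 6
routes = decode_routes(demands, capacity,
                       score_fn=lambda last, i, remaining: 1.0,
                       rng=random.Random(0))
```

Whatever the scores are, every customer ends up in exactly one route and no route exceeds the capacity.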
Approximation of gradient using REINFORCE
REINFORCE (Williams, 1992), policy gradient
Minimize the loss function $J(\theta \mid s) = E[L(\pi \mid s)]$
Gradient: $\nabla_\theta J(\theta \mid s) \approx E[(L(\pi \mid s) - b(s)) \cdot \nabla_\theta \log p_\theta(\pi \mid s)]$
$\log p_\theta(\pi \mid s) = \log\left(\prod_{i=1}^{n} p_\theta(\pi_i \mid \pi_{<i}, s)\right) = \sum_{i=1}^{n} \log p_\theta(\pi_i \mid \pi_{<i}, s)$, nodes $i \in \{0, 1, \dots, n\}$
$\log p = \mathrm{LogSoftmax}(x_i) = \log \frac{\exp(x_i)}{\sum_j \exp(x_j)}$, in range $(-\infty, 0]$
・$L(\pi \mid s)$: length of path
・$\pi$: path (index permutation)
・$s$: graph
・$b(s)$: baseline, used to reduce variance; updated every epoch if the current model performs well
・$\nabla_\theta$: gradient with respect to $\theta$
・$p_\theta$: probability
$\nabla_\theta J(\theta \mid s) \to 0$
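A small numeric sketch of the REINFORCE estimate follows. It computes the surrogate quantity mean of $(L(\pi \mid s) - b(s)) \cdot \log p_\theta(\pi \mid s)$ over a batch; in an automatic-differentiation framework the factor $L - b$ would be held constant so that the gradient of this surrogate matches the estimate above. The function names and batch values are made up for illustration.

```python
import math


def log_prob_of_tour(step_probs):
    """log p(pi | s) as the sum of the per-step log-probabilities."""
    return sum(math.log(p) for p in step_probs)


def reinforce_loss(lengths, baselines, log_probs):
    """Batch mean of (L - b) * log p, the surrogate for the policy gradient."""
    terms = [(L - b) * lp for L, b, lp in zip(lengths, baselines, log_probs)]
    return sum(terms) / len(terms)


# Two sampled tours (made-up numbers): per-step probabilities, lengths, baselines.
step_probs = [[0.5, 0.5], [0.5, 1.0]]
lengths = [10.0, 8.0]
baselines = [9.0, 9.0]

log_probs = [log_prob_of_tour(p) for p in step_probs]
loss = reinforce_loss(lengths, baselines, log_probs)
```

The first tour is longer than its baseline, so minimizing the surrogate lowers its log-probability; the second is shorter than its baseline, so its log-probability is raised.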
Results
Reference
Paper
・[1] Attention, Learn to Solve Routing Problems! (Wouter Kool et al. 2019)
・[2] Attention Is All You Need (Vaswani et al. 2017)
Article
・https://qiita.com/ohtaman/items/0c383da89516d03c3ac0 (Solving mathematical optimization problems with deep learning [Part 1])
Implementation
・https://github.com/Rintarooo/VRP_MHA (my own TensorFlow 2 implementation)
・https://github.com/wouterkool/attention-learn-to-route (official PyTorch implementation)
・https://github.com/alexeypustynnikov/AM-VRP
・https://github.com/d-eremeev/ADM-VRP