A hybrid deep learning approach to vertexing
Rui Fang [1]
Henry Schreiner [1, 2]
Mike Sokoloff [1]
Constantin Weisser [3]
Mike Williams [3]
April 3, 2019
[1] The University of Cincinnati
[2] Princeton University
[3] Massachusetts Institute of Technology
CtD/WIT 2019
Supported by:
[Figure: efficiency (0.0 to 1.0) and number of PVs (log scale) versus # LHCb long tracks (0 to 60)]
Asymmetric cost function: Found 103002 of 109733 (eff 93.87%), False positive rate = 0.251 per event
Symmetric cost function: Found 96616 of 109733 (eff 88.05%), False positive rate = 0.0485 per event
Events in sample = 20K
Training sample = 240K
1/16 Fang, Schreiner, Sokoloff, Weisser, Williams
A hybrid deep learning approach to vertexing
April 3, 2019
Tracking in the LHCb upgrade Introduction
The changes
• 30 MHz software trigger
• 7.6 PVs per event (Poisson distribution)
• Roughly 5.5 visible PVs per event
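As a rough check of what a Poisson mean of 7.6 PVs per event implies, the sketch below evaluates the Poisson probability mass function with the Python standard library. Only the mean comes from the slide; the function names and the thresholds printed are illustrative assumptions.

```python
import math

MEAN_PVS = 7.6  # mean number of PVs per event, taken from the slide


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k occurrences for a Poisson distribution."""
    return math.exp(-mean) * mean**k / math.factorial(k)


def prob_at_least(n: int, mean: float) -> float:
    """Probability of n or more occurrences."""
    return 1.0 - sum(poisson_pmf(k, mean) for k in range(n))


if __name__ == "__main__":
    # Thresholds chosen only for illustration.
    for n in (1, 5, 10, 15):
        print(f"P(N >= {n:2d}) = {prob_at_least(n, MEAN_PVS):.4f}")
```

The tail probabilities give a feel for how often an event carries many more PVs than the mean, which is the pileup regime the algorithm has to handle.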
The problem
• Much higher pileup
• Very little time to do the tracking
• Current algorithms too slow
We need to rethink our algorithms from the ground up...
Vertices and tracks Introduction
Vertices
• Events contain ≈ 7 Primary Vertices (≈ 5 visible PVs)
A PV should contain 5+ long tracks
• Multiple Secondary Vertices (SVs) per event as well
An SV should contain 2+ tracks
[Figure: schematic showing the beams, a PV, a track, and an SV]
Adapt to machine learning?
• Sparse 3D data (41M pixels) → rich 1D data
• 1D convolutional neural nets
• Highly parallelizable, GPU friendly
• Opportunities to visualize learning process
A hybrid ML approach Introduction
[Diagram: pipeline with the stages Tracking, Kernel generation, Make predictions (CNNs), and Interpret results; Truth is used for Training and Validation]
Machine learning features (so far)
• Prototracking converts sparse 3D dataset to feature-rich 1D dataset
• Easy and effective visualization due to 1D nature
• Even simple networks can provide interesting results
What follows is a proof of principle implementation for finding PVs.
Kernel generation Design
Tracking procedure
• Hits lie on the 26 planes
• For simplicity, only 3 tracks shown
• Make a 3D grid of voxels (2D shown)
• Note: only z will be fully calculated and stored
• Tracking (full or partial)
• Fill in each voxel center with Gaussian PDF
• PDF for each (proto)track is combined
• Fill z “histogram” with maximum KDE value in xy
[Figure: x versus z (along the beam) view showing the tracks, the voxel grid, the PV, and the resulting kernel]
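The kernel-generation steps above can be sketched in a few lines of pure Python. This is a simplified illustration under stated assumptions, not the pv-finder implementation: tracks are straight lines, each track contributes a 2D Gaussian in the transverse distance at a given z, the per-track PDFs are combined by summing, and the resolution `sigma`, the grid spacing, and all function names are hypothetical.

```python
import math


def track_xy_at(track, z):
    """Extrapolate a straight-line track (x0, y0, z0, tx, ty) to the plane z.

    tx and ty are the slopes dx/dz and dy/dz; units are mm.
    """
    x0, y0, z0, tx, ty = track
    return x0 + tx * (z - z0), y0 + ty * (z - z0)


def gaussian_2d(dx, dy, sigma):
    """Isotropic 2D Gaussian PDF evaluated at offset (dx, dy)."""
    return math.exp(-0.5 * (dx * dx + dy * dy) / sigma**2) / (2 * math.pi * sigma**2)


def z_kernel(tracks, z_centers, xy_centers, sigma):
    """For each z bin, return the maximum over xy of the summed track PDFs."""
    kernel = []
    for z in z_centers:
        positions = [track_xy_at(t, z) for t in tracks]
        best = 0.0
        for x in xy_centers:          # voxel centers in x
            for y in xy_centers:      # voxel centers in y
                density = sum(gaussian_2d(x - px, y - py, sigma)
                              for px, py in positions)
                best = max(best, density)
        kernel.append(best)           # only the z "histogram" is stored
    return kernel


if __name__ == "__main__":
    # Three toy tracks that all pass through (0, 0, 10 mm).
    tracks = [
        (0.0, 0.0, 10.0, 0.10, 0.00),
        (0.0, 0.0, 10.0, -0.10, 0.05),
        (0.0, 0.0, 10.0, 0.00, -0.10),
    ]
    z_centers = [float(i) for i in range(21)]
    xy_centers = [0.05 * i for i in range(-4, 5)]
    kernel = z_kernel(tracks, z_centers, xy_centers, sigma=0.05)
    print("peak at z =", z_centers[kernel.index(max(kernel))], "mm")
```

Because the tracks only overlap near their common origin, the summed density is largest in the z bin containing the vertex, which is why peaks in the z histogram tend to mark PVs.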
Example of z KDE histogram Design
[Figure: density of kernel versus z values [mm], showing the kernel together with markers for LHCb PVs, other PVs, LHCb SVs, and other SVs]
Note: All events from toy detector simulation
Human learning
• Peaks generally correspond to PVs and SVs
Challenges
• Vertex may be offset from peak
• Vertices interact
Target distribution Design
Build target distribution
• True PV position as the mean of Gaussian
• σ (standard deviation) is 100 µm (simplification)
• Fill bins with integrated PDF within ±3 bins (±300 µm)
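A minimal sketch of the target construction described above, using only the standard library. The 100 µm standard deviation and the ±3 bin window come from the slide, and the 100 µm bin width follows from "±3 bins (±300 µm)". The 4000-bin length is taken from the network architecture slide, while the lower edge of the z range (`Z_MIN_MM`) and the function names are assumptions made for illustration.

```python
import math

BIN_WIDTH_MM = 0.1   # 100 µm bins, implied by "±3 bins (±300 µm)"
SIGMA_MM = 0.1       # 100 µm standard deviation, from the slide
N_BINS = 4000        # histogram length used by the network
Z_MIN_MM = -100.0    # assumption: lower edge of the z range
HALF_WINDOW = 3      # fill ±3 bins around the true PV


def normal_cdf(x, mean, sigma):
    """Cumulative distribution function of a Gaussian."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sigma * math.sqrt(2.0))))


def build_target(true_pv_z_mm):
    """Return the target histogram for a list of true PV z positions (mm)."""
    target = [0.0] * N_BINS
    for z_pv in true_pv_z_mm:
        center = math.floor((z_pv - Z_MIN_MM) / BIN_WIDTH_MM)
        for b in range(center - HALF_WINDOW, center + HALF_WINDOW + 1):
            if 0 <= b < N_BINS:
                lo = Z_MIN_MM + b * BIN_WIDTH_MM
                hi = lo + BIN_WIDTH_MM
                # Integrated Gaussian PDF over the bin [lo, hi).
                target[b] += normal_cdf(hi, z_pv, SIGMA_MM) - normal_cdf(lo, z_pv, SIGMA_MM)
    return target


if __name__ == "__main__":
    target = build_target([48.904])
    filled = [(i, round(v, 4)) for i, v in enumerate(target) if v > 0]
    print(filled)
```

Integrating the PDF over each bin, rather than sampling it at the bin center, keeps the target sensitive to where the true PV sits inside its bin.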
Neural network architecture Design
• Inputs: 4000 bins
• Convolution 1: width 25, channels 1 → 25, Leaky ReLU
• Convolution 2: width 15, channels 25 → 25, Leaky ReLU
• Convolution 3: width 15, channels 25 → 25, Leaky ReLU
• Convolution 4: width 5, channels 25 → 1, Leaky ReLU
• Convolution 5: width 91, channels 1 → 1, Softplus
• Output: 4000 bins
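The five convolutions on this slide (widths 25, 15, 15, 5, and 91) keep the 4000-bin length from input to output. The bookkeeping sketch below shows how that works out, assuming stride 1, zero padding of (width - 1) / 2 on each side, and one bias per output channel; none of these three choices is stated on the slide, and the helper names are hypothetical.

```python
from typing import NamedTuple


class ConvLayer(NamedTuple):
    width: int
    in_channels: int
    out_channels: int
    activation: str


# Layer list as read from the slide.
LAYERS = [
    ConvLayer(25, 1, 25, "leaky_relu"),
    ConvLayer(15, 25, 25, "leaky_relu"),
    ConvLayer(15, 25, 25, "leaky_relu"),
    ConvLayer(5, 25, 1, "leaky_relu"),
    ConvLayer(91, 1, 1, "softplus"),
]


def same_padding(layer: ConvLayer) -> int:
    """Zero padding per side that preserves length for an odd width (assumed)."""
    return (layer.width - 1) // 2


def output_length(n_in: int, layer: ConvLayer) -> int:
    """Output length of a stride-1 convolution with the padding above."""
    return n_in + 2 * same_padding(layer) - layer.width + 1


def n_parameters(layer: ConvLayer) -> int:
    """Weights plus one bias per output channel (bias is an assumption)."""
    return layer.width * layer.in_channels * layer.out_channels + layer.out_channels


def receptive_field(layers) -> int:
    """Number of input bins that can influence one output bin."""
    return 1 + sum(layer.width - 1 for layer in layers)


if __name__ == "__main__":
    n = 4000
    for layer in LAYERS:
        n = output_length(n, layer)
    print("output length:", n)
    print("receptive field [bins]:", receptive_field(LAYERS))
    print("parameters:", sum(n_parameters(layer) for layer in LAYERS))
```

Under these assumptions each output bin sees 147 input bins, and the whole network has fewer than 20,000 parameters, which fits the slide's remark that even simple networks give interesting results.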
Cost function Design
[Figure: cost versus yhat, on linear and logarithmic yhat axes, for the asymmetric and symmetric cost functions at y = 0.10, y = 0.30, and y = 1e-5]
Approach
• Symmetric cost function: low FP but low efficiency
• Adding asymmetry term controls trade-off for FP vs. efficiency
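The slides show the cost curves but not the formula, so the following is only a hypothetical illustration of how an asymmetry term can shift the balance between false positives and efficiency. It is not the cost function used in this work: it simply weights a squared error more heavily when the prediction `y_hat` falls below the target `y`, and the `asymmetry` parameter is an invented name.

```python
def symmetric_cost(y_hat: float, y: float) -> float:
    """Squared error: over- and under-prediction are penalized equally."""
    return (y_hat - y) ** 2


def asymmetric_cost(y_hat: float, y: float, asymmetry: float) -> float:
    """Squared error with extra weight on under-prediction.

    asymmetry = 0 reproduces the symmetric cost. Larger values make a missed
    peak (y_hat < y) more expensive than a spurious one, pushing a trained
    model toward higher efficiency at the price of more false positives.
    """
    weight = 1.0 + asymmetry if y_hat < y else 1.0
    return weight * (y_hat - y) ** 2


if __name__ == "__main__":
    for a in (0.0, 1.0, 4.0):
        under = asymmetric_cost(0.2, 0.3, a)   # prediction too low
        over = asymmetric_cost(0.4, 0.3, a)    # prediction too high
        print(f"asymmetry={a}: under={under:.4f} over={over:.4f}")
```

Scanning such a parameter during training is one way to trace out a curve of false positives per event versus efficiency.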
False Positive and efficiency rates Results
[Figure: FP per event (linear and logarithmic scales) versus Efficiency [%] over 88 to 94, running from the symmetric cost to the most asymmetric cost]
Search for PVs (handwritten, maybe not optimal)
• Search ±5 bins (±500 µm) around a true PV
• At least 3 bins with predicted probability > 1% and integrated probability > 20%
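The matching criteria above can be written as a short function. This is a sketch of one reading of the slide, not the authors' code: it assumes the integrated probability is summed over the whole ±5 bin window, and the function and parameter names are hypothetical.

```python
def pv_is_found(predicted, true_bin, half_window=5, min_bins=3,
                bin_threshold=0.01, min_integral=0.20):
    """Decide whether a true PV at bin index true_bin counts as found.

    predicted is the network output histogram (one value per z bin).
    """
    lo = max(0, true_bin - half_window)
    hi = min(len(predicted), true_bin + half_window + 1)
    window = predicted[lo:hi]
    # Criterion 1: enough bins above the per-bin threshold.
    n_above = sum(1 for p in window if p > bin_threshold)
    # Criterion 2: enough total predicted probability in the window (assumed).
    return n_above >= min_bins and sum(window) > min_integral


def efficiency(predicted, true_bins):
    """Fraction of true PVs that are found."""
    if not true_bins:
        return 0.0
    return sum(pv_is_found(predicted, b) for b in true_bins) / len(true_bins)


if __name__ == "__main__":
    predicted = [0.0] * 4000
    predicted[1488:1491] = [0.30, 0.35, 0.20]   # toy predicted peak
    print(pv_is_found(predicted, 1489))          # PV near the peak
    print(pv_is_found(predicted, 1500))          # PV far from any peak
    print(efficiency(predicted, [1489, 1500]))
```

Loosening `bin_threshold` or `min_integral` would raise the efficiency and the false positive rate together, so these cuts interact with the asymmetry parameter.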
Tunable efficiency vs. FP
• The asymmetry parameter controls FP vs. efficiency
Compare predictions with targets: Examples Results
[Figure, PV found example: Event 5 @ 197.4 mm: PV found; True: 197.461 mm, Pred: 197.396 mm, Δ: -65 µm; panels show kernel density, xy maximum (x, y), and probability (target, predicted, masked) versus z values [mm]]
[Figure, PV found example: Event 6 @ 36.1 mm: PV found; True: 36.068 mm, Pred: 36.400 mm, Δ: 332 µm; same panels versus z values [mm]]
Compare predictions with targets: When it works Results
[Figure, PV found example: Event 0 @ 48.9 mm: PV found; True: 48.904 mm, Pred: 48.954 mm, Δ: 50 µm; panels show kernel density, xy maximum (x, y), and probability (target, predicted, masked) versus z values [mm]]
[Figure, Masked (<5 tracks) example: Event 0 @ 1.0 mm: Masked; Pred: 0.976 mm; same panels versus z values [mm]]
Compare predictions with targets: When it fails Results
[Figure, False Positive example: Event 2 @ 65.7 mm: False positive; Pred: 65.696 mm; panels show kernel density, xy maximum (x, y), and probability (target, predicted) versus z values [mm]]
[Figure, PV not found example: Event 3 @ 51.9 mm: PV not found; True: 51.898 mm; panels show kernel density, xy maximum (x, y), and probability (target, predicted, masked) versus z values [mm]]
Future addition: xy information Future plans
Adding xy information
• Point of maximum z in xy available
• Extra information: sharp discontinuities between PVs
• Need iterative approach or “reduced importance”
What about a full 2D kernel?
• Not needed for LHCb currently (large xy, “low” z overlap)
• Might be useful for other detectors!
[Figure: Event 2 @ 114.6 mm: PV found; True: 114.622 mm, Pred: 114.597 mm, Δ: -26 µm; panels show kernel density, xy maximum (x, y), and probability (target, predicted) versus z values [mm]]
Conclusions and plans Future plans
[Figure: efficiency (0.0 to 1.0) versus # LHCb long tracks (0 to 60)]
• Proof-of-Principle established: a hybrid ML algorithm using a 1-dimensional KDE processed by a 5-layer CNN finds primary vertices with efficiencies and false positive rates similar to traditional algorithms.
• Efficiency is tunable; increasing the efficiency also increases the false positive rate.
• Adding information should improve performance.
  • can add KDE (x,y) information to algorithm
  • can associate tracks to PV candidates, then iterate.
• Next steps: train with full LHCb MC and deploy inference engine in LHCb Hlt1 framework.
• Beyond LHCb
  • approach might work for ATLAS and CMS (in 2D?);
  • algorithm is an interesting ML laboratory.
Final words Future plans
Source code:
• https://gitlab.cern.ch/LHCb-Reco-Dev/pv-finder
• Runnable with Conda on macOS and Linux
Run: conda env create -f environment-gpu.yml
Python 3.6+ and PyTorch used for machine learning code
Generation now available too using the new Conda-Forge ROOT and Pythia8 packages
Supported by:
• NSF OAC-1836650: IRIS-HEP
• NSF OAC-1740102: SI2:SSE
• NSF OAC-1739772: SI2:SSE
More predictions with targets (1) Backup
[Figure: Event 5 @ 221.5 mm: PV found; True: 221.595 mm, Pred: 221.546 mm, Δ: -49 µm; panels show kernel density, xy maximum (x, y), and probability (target, predicted, masked) versus z values [mm]]
[Figure: Event 2 @ 114.6 mm: PV found; True: 114.622 mm, Pred: 114.597 mm, Δ: -26 µm; panels show kernel density, xy maximum (x, y), and probability (target, predicted) versus z values [mm]]
More predictions with targets (2) Backup
[Figure: Event 6 @ 129.3 mm: PV found; True: 129.336 mm, Pred: 129.337 mm, Δ: 1 µm; panels show kernel density, xy maximum (x, y), and probability (target, predicted, masked) versus z values [mm]]
[Figure: Event 6 @ 143.2 mm: PV found; True: 143.224 mm, Pred: 143.199 mm, Δ: -25 µm; same panels versus z values [mm]]
More predictions with targets (3) Backup
[Figure: Event 6 @ 150.4 mm: PV found; True: 150.650 mm, Pred: 150.416 mm, Δ: -234 µm; panels show kernel density, xy maximum (x, y), and probability (target, predicted, masked) versus z values [mm]]
[Figure: Event 6 @ 179.6 mm: PV found; True: 179.560 mm, Pred: 179.591 mm, Δ: 31 µm; same panels versus z values [mm]]
The VELO Backup
Tracks
• Originate from vertices (not shown)
• Hits originate from tracks
• We only know the true track in simulation
• Nearly straight, but tracks may scatter in material
The VELO
• A set of 26 planes that detect tracks
• Tracks should hit one or more pixels per plane
• Sparse 3D dataset (41M pixels)
Questions for other experiments Backup
• Beam width (x, y): 40 µm for LHCb, what is yours?
• Transverse resolution: 5–15 µm for LHCb depending on number of tracks, what is yours?
• Longitudinal resolution: 40–100 µm for LHCb depending on number of tracks, what is yours?
• Cleaning up prototracks based on IP could simplify kernel
• Can prototracking be done in the triggers?

More Related Content

PDF
ACAT 2019: A hybrid deep learning approach to vertexing
PDF
HOW 2019: Machine Learning for the Primary Vertex Reconstruction
PDF
LHCb Computing Workshop 2018: PV finding with CNNs
PPTX
Evaluation of geometrical parameters of buildings from SAR images
PPT
Using Very High Resolution Satellite Images for Planning Activities in Mining
PDF
ES_SAA_OG_PF_ECCTD_Pos
PPT
Renewable energy course#02
PPT
Renewable energy course#02 gen
ACAT 2019: A hybrid deep learning approach to vertexing
HOW 2019: Machine Learning for the Primary Vertex Reconstruction
LHCb Computing Workshop 2018: PV finding with CNNs
Evaluation of geometrical parameters of buildings from SAR images
Using Very High Resolution Satellite Images for Planning Activities in Mining
ES_SAA_OG_PF_ECCTD_Pos
Renewable energy course#02
Renewable energy course#02 gen

What's hot (19)

PPTX
Esmaeilzade sampling
PDF
Neighbourhood Preserving Quantisation for LSH SIGIR Poster
PDF
1414 15 w 4meters all in one di
PDF
GoogLeNet Insights
PDF
Goddard-DR-2010
PPTX
Direction of Arrival (DOA) Estimation With Two Element Antennas
PDF
Adaptive Channel Prediction, Beamforming and Scheduling Design for 5G V2I Net...
PPTX
Distance and Time Based Node Selection for Probabilistic Coverage in People-C...
PDF
Touya et al_issdq_presentation
PPTX
Real-Time Visual Simulation of Smoke
PPTX
Urban 3D Semantic Modelling Using Stereo Vision, ICRA 2013
PDF
“Introduction to Simultaneous Localization and Mapping (SLAM),” a Presentatio...
PPTX
Сегментация объектов на спутниковых снимках (Kaggle DSTL) / Артур Кузин (Avito)
PDF
Scratch to Supercomputers: Bottoms-up Build of Large-scale Computational Lens...
PDF
Large scale landuse classification of satellite imagery
PDF
Landuse Classification from Satellite Imagery using Deep Learning
PPT
An Overview of HDF-EOS (Part 1)
PDF
DSD-INT 2015 - 3Di pilot application in Taiwan - Jhih-Cyuan Shen, Geert Prinsen
Esmaeilzade sampling
Neighbourhood Preserving Quantisation for LSH SIGIR Poster
1414 15 w 4meters all in one di
GoogLeNet Insights
Goddard-DR-2010
Direction of Arrival (DOA) Estimation With Two Element Antennas
Adaptive Channel Prediction, Beamforming and Scheduling Design for 5G V2I Net...
Distance and Time Based Node Selection for Probabilistic Coverage in People-C...
Touya et al_issdq_presentation
Real-Time Visual Simulation of Smoke
Urban 3D Semantic Modelling Using Stereo Vision, ICRA 2013
“Introduction to Simultaneous Localization and Mapping (SLAM),” a Presentatio...
Сегментация объектов на спутниковых снимках (Kaggle DSTL) / Артур Кузин (Avito)
Scratch to Supercomputers: Bottoms-up Build of Large-scale Computational Lens...
Large scale landuse classification of satellite imagery
Landuse Classification from Satellite Imagery using Deep Learning
An Overview of HDF-EOS (Part 1)
DSD-INT 2015 - 3Di pilot application in Taiwan - Jhih-Cyuan Shen, Geert Prinsen
Ad

Similar to 2019 CtD: A hybrid deep learning approach to vertexing (20)

PDF
2019 IML workshop: A hybrid deep learning approach to vertexing
PDF
GTC Taiwan 2017 GPU 平台上導入深度學習於半導體產業之 EDA 應用
PDF
Compressive Imaging Structure Sampling Learning 1st Edition Ben Adcock
PDF
MSEE Defense
PDF
Decision Forests and discriminant analysis
PPT
Dream3D and its Extension to Abaqus Input Files
PPT
IllinoisScan_seminar.ppt
PDF
EfficientML.ai Lecture Neural Architecture Search
PDF
"Separable Convolutions for Efficient Implementation of CNNs and Other Vision...
PDF
Maxim Integrated MAX21000 3-Axis MEMS Gyroscope teardown reverse costing repo...
PDF
Learning From a Few Large-Scale Partial Examples: Computational Tools, Regul...
PPTX
LRP for hand gesture recogntion.pptx
PPT
Antenna synthesis
PPTX
Topic 4.1
PDF
Big Data Competition: maximizing your potential
 exampled with the 2014 Higgs...
PDF
Deep Learning-Based Universal Beamformer for Ultrasound Imaging
PPT
Talk g siringo_laboca_spie20080626_last
PDF
IRJET- Synchronization Scheme of MIMO-OFDM using Monte Carlo Method
PPTX
From APECE to ASML A Semiconductor Journey
PDF
Coping With Interference In Wireless Networks Kazemitabar Seyed Javad
2019 IML workshop: A hybrid deep learning approach to vertexing
GTC Taiwan 2017 GPU 平台上導入深度學習於半導體產業之 EDA 應用
Compressive Imaging Structure Sampling Learning 1st Edition Ben Adcock
MSEE Defense
Decision Forests and discriminant analysis
Dream3D and its Extension to Abaqus Input Files
IllinoisScan_seminar.ppt
EfficientML.ai Lecture Neural Architecture Search
"Separable Convolutions for Efficient Implementation of CNNs and Other Vision...
Maxim Integrated MAX21000 3-Axis MEMS Gyroscope teardown reverse costing repo...
Learning From a Few Large-Scale Partial Examples: Computational Tools, Regul...
LRP for hand gesture recogntion.pptx
Antenna synthesis
Topic 4.1
Big Data Competition: maximizing your potential
 exampled with the 2014 Higgs...
Deep Learning-Based Universal Beamformer for Ultrasound Imaging
Talk g siringo_laboca_spie20080626_last
IRJET- Synchronization Scheme of MIMO-OFDM using Monte Carlo Method
From APECE to ASML A Semiconductor Journey
Coping With Interference In Wireless Networks Kazemitabar Seyed Javad
Ad

More from Henry Schreiner (20)

PDF
SciPy 2025 - Packaging a Scientific Python Project
PDF
Tools That Help You Write Better Code - 2025 Princeton Software Engineering S...
PDF
Princeton RSE: Building Python Packages (+binary)
PDF
Tools to help you write better code - Princeton Wintersession
PDF
Learning Rust with Advent of Code 2023 - Princeton
PDF
The two flavors of Python 3.13 - PyHEP 2024
PDF
Modern binary build systems - PyCon 2024
PDF
Software Quality Assurance Tooling - Wintersession 2024
PDF
Princeton RSE Peer network first meeting
PDF
Software Quality Assurance Tooling 2023
PDF
Princeton Wintersession: Software Quality Assurance Tooling
PDF
What's new in Python 3.11
PDF
Everything you didn't know you needed
PDF
SciPy22 - Building binary extensions with pybind11, scikit build, and cibuild...
PDF
SciPy 2022 Scikit-HEP
PDF
PyCon 2022 -Scikit-HEP Developer Pages: Guidelines for modern packaging
PDF
PyCon2022 - Building Python Extensions
PDF
boost-histogram / Hist: PyHEP Topical meeting
PDF
Digital RSE: automated code quality checks - RSE group meeting
PDF
CMake best practices
SciPy 2025 - Packaging a Scientific Python Project
Tools That Help You Write Better Code - 2025 Princeton Software Engineering S...
Princeton RSE: Building Python Packages (+binary)
Tools to help you write better code - Princeton Wintersession
Learning Rust with Advent of Code 2023 - Princeton
The two flavors of Python 3.13 - PyHEP 2024
Modern binary build systems - PyCon 2024
Software Quality Assurance Tooling - Wintersession 2024
Princeton RSE Peer network first meeting
Software Quality Assurance Tooling 2023
Princeton Wintersession: Software Quality Assurance Tooling
What's new in Python 3.11
Everything you didn't know you needed
SciPy22 - Building binary extensions with pybind11, scikit build, and cibuild...
SciPy 2022 Scikit-HEP
PyCon 2022 -Scikit-HEP Developer Pages: Guidelines for modern packaging
PyCon2022 - Building Python Extensions
boost-histogram / Hist: PyHEP Topical meeting
Digital RSE: automated code quality checks - RSE group meeting
CMake best practices

Recently uploaded (20)

PDF
Network Security Unit 5.pdf for BCA BBA.
PDF
Encapsulation theory and applications.pdf
PDF
Empathic Computing: Creating Shared Understanding
PDF
CIFDAQ's Market Insight: SEC Turns Pro Crypto
PDF
How UI/UX Design Impacts User Retention in Mobile Apps.pdf
PPTX
PA Analog/Digital System: The Backbone of Modern Surveillance and Communication
PDF
Blue Purple Modern Animated Computer Science Presentation.pdf.pdf
PDF
Per capita expenditure prediction using model stacking based on satellite ima...
PDF
Shreyas Phanse Resume: Experienced Backend Engineer | Java • Spring Boot • Ka...
PDF
NewMind AI Weekly Chronicles - August'25 Week I
PDF
Machine learning based COVID-19 study performance prediction
PDF
Encapsulation_ Review paper, used for researhc scholars
PDF
Modernizing your data center with Dell and AMD
PDF
Dropbox Q2 2025 Financial Results & Investor Presentation
PDF
Building Integrated photovoltaic BIPV_UPV.pdf
PDF
Architecting across the Boundaries of two Complex Domains - Healthcare & Tech...
PDF
Bridging biosciences and deep learning for revolutionary discoveries: a compr...
PDF
Peak of Data & AI Encore- AI for Metadata and Smarter Workflows
PPTX
MYSQL Presentation for SQL database connectivity
PDF
KodekX | Application Modernization Development
Network Security Unit 5.pdf for BCA BBA.
Encapsulation theory and applications.pdf
Empathic Computing: Creating Shared Understanding
CIFDAQ's Market Insight: SEC Turns Pro Crypto
How UI/UX Design Impacts User Retention in Mobile Apps.pdf
PA Analog/Digital System: The Backbone of Modern Surveillance and Communication
Blue Purple Modern Animated Computer Science Presentation.pdf.pdf
Per capita expenditure prediction using model stacking based on satellite ima...
Shreyas Phanse Resume: Experienced Backend Engineer | Java • Spring Boot • Ka...
NewMind AI Weekly Chronicles - August'25 Week I
Machine learning based COVID-19 study performance prediction
Encapsulation_ Review paper, used for researhc scholars
Modernizing your data center with Dell and AMD
Dropbox Q2 2025 Financial Results & Investor Presentation
Building Integrated photovoltaic BIPV_UPV.pdf
Architecting across the Boundaries of two Complex Domains - Healthcare & Tech...
Bridging biosciences and deep learning for revolutionary discoveries: a compr...
Peak of Data & AI Encore- AI for Metadata and Smarter Workflows
MYSQL Presentation for SQL database connectivity
KodekX | Application Modernization Development

2019 CtD: A hybrid deep learning approach to vertexing

  • 1. A hybrid deep learning approach to vertexing Rui Fang1 Henry Schreiner1, 2 Mike Sokoloff1 Constantin Weisser3 Mike Williams3 April 3, 2019 1 The University of Cincinnati 2 Princeton University 3 Massachusetts Institute of Technology CtD/WIT 2019 Supported by:
  • 2. 0 5 10 15 20 25 30 35 40 45 50 55 60 # LHCb long tracks 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 Efficiency Found 103002 of 109733 (eff 93.87%) False positive rate = 0.251 per event Asymmetric cost function Found 96616 of 109733 (eff 88.05%) False positive rate = 0.0485 per event Symmetric cost function Events in sample = 20K Training sample = 240K 0 5 10 15 20 25 30 35 40 45 50 55 60 # LHCb long tracks 102 103 104 PVs 1/16Fang, Schreiner, Sokoloff, Weisser, Williams A hybrid deep learning approach to vertexing April 3, 2019
  • 3. Tracking in the LHCb upgrade Introduction The changes • 30 MHz software trigger • 7.6 PVs per event (Poisson distribution) • Roughly 5.5 visible PVs per event The problem • Much higher pileup • Very little time to do the tracking • Current algorithms too slow We need to rethink our algorithms from the ground up... 2/16Fang, Schreiner, Sokoloff, Weisser, Williams A hybrid deep learning approach to vertexing April 3, 2019
  • 4. Vertices and tracks Introduction Vertices • Events contain ≈ 7 Primary Vertices (≈ 5 visible PVs) A PV should contain 5+ long tracks • Multiple Secondary Vertices (SVs) per event as well A SV should contain 2+ tracks Beams PV Track SV Adapt to machine learning? • Sparse 3D data (41M pixels) → rich 1D data • 1D convolutional neural nets • Highly parallelizable, GPU friendly • Opportunities to visualize learning process 3/16Fang, Schreiner, Sokoloff, Weisser, Williams A hybrid deep learning approach to vertexing April 3, 2019
  • 5. A hybrid ML approach Introduction Tracking Kernel generation Make predictions CNNs Interpret results Truth Training Validation Machine learning features (so far) • Prototracking converts sparse 3D dataset to feature-rich 1D dataset • Easy and effective visualization due to 1D nature • Even simple networks can provide interesting results What follows is a proof of principle implementation for finding PVs. 4/16Fang, Schreiner, Sokoloff, Weisser, Williams A hybrid deep learning approach to vertexing April 3, 2019
  • 6. Kernel generation Design Tracking procedure • Hits lie on the 26 planes • For simplicity, only 3 tracks shown z axis (along the beam) x PV 5/16Fang, Schreiner, Sokoloff, Weisser, Williams A hybrid deep learning approach to vertexing April 3, 2019
  • 7. Kernel generation Design Tracking procedure • Hits lie on the 26 planes • For simplicity, only 3 tracks shown • Make a 3D grid of voxels (2D shown) • Note: only z will be fully calculated and stored z axis (along the beam) x PV 5/16Fang, Schreiner, Sokoloff, Weisser, Williams A hybrid deep learning approach to vertexing April 3, 2019
  • 8. Kernel generation Design Tracking procedure • Hits lie on the 26 planes • For simplicity, only 3 tracks shown • Make a 3D grid of voxels (2D shown) • Note: only z will be fully calculated and stored • Tracking (full or partial) z axis (along the beam) x PV 5/16Fang, Schreiner, Sokoloff, Weisser, Williams A hybrid deep learning approach to vertexing April 3, 2019
  • 9. Kernel generation Design Tracking procedure • Hits lie on the 26 planes • For simplicity, only 3 tracks shown • Make a 3D grid of voxels (2D shown) • Note: only z will be fully calculated and stored • Tracking (full or partial) • Fill in each voxel center with Gaussian PDF z axis (along the beam) x PV 5/16Fang, Schreiner, Sokoloff, Weisser, Williams A hybrid deep learning approach to vertexing April 3, 2019
  • 10. Kernel generation Design Tracking procedure • Hits lie on the 26 planes • For simplicity, only 3 tracks shown • Make a 3D grid of voxels (2D shown) • Note: only z will be fully calculated and stored • Tracking (full or partial) • Fill in each voxel center with Gaussian PDF • PDF for each (proto)track is combined z axis (along the beam) x PV 5/16Fang, Schreiner, Sokoloff, Weisser, Williams A hybrid deep learning approach to vertexing April 3, 2019
  • 11. Kernel generation Design Tracking procedure • Hits lie on the 26 planes • For simplicity, only 3 tracks shown • Make a 3D grid of voxels (2D shown) • Note: only z will be fully calculated and stored • Tracking (full or partial) • Fill in each voxel center with Gaussian PDF • PDF for each (proto)track is combined • Fill z “histogram” with maximum KDE value in xy z axis (along the beam) x Kernel PV 5/16Fang, Schreiner, Sokoloff, Weisser, Williams A hybrid deep learning approach to vertexing April 3, 2019
  • 12. Example of z KDE histogram Design 100 50 0 50 100 150 200 250 300 z values [mm] 0 500 1000 1500 2000 DensityofKernel Kernel LHCb PVs Other PVs LHCb SVs Other SVs Note: All events from toy detector simulation Human learning • Peaks generally correspond to PVs and SVs Challenges • Vertex may be offset from peak • Vertices interact 6/16Fang, Schreiner, Sokoloff, Weisser, Williams A hybrid deep learning approach to vertexing April 3, 2019
  • 13. Target distribution Design Build target distribution • True PV position as the mean of Gaussian • σ (standard deviation) is 100 µm (simplification) • Fill bins with integrated PDF within ±3 bins (±300 µm) 7/16Fang, Schreiner, Sokoloff, Weisser, Williams A hybrid deep learning approach to vertexing April 3, 2019
  • 14. Neural network architecture Design Inputs 1 2 3 · · · 25 26 · · · 3998 3999 4000 Convolution Width: 25 Channels: 1 → 25 25 Channels 1 2 3 · · · 15 16 · · · 3998 3999 4000 Convolution Width: 15 Channels: 25 → 25 25 Channels 1 2 3 · · · 15 16 · · · 3998 3999 4000 Convolution Width: 15 Channels: 25 → 25 25 Channels 1 2 3 4 5 6 · · · 3998 3999 4000 Convolution Width: 5 Channels: 25 → 1 1 Channel 1 2 3 · · · 91 92 · · · 3998 3999 4000 Convolution Width: 91 Channels: 1 → 1 Output 1 2 3 4 5 · · · 3997 3998 3999 4000 -x x y Leaky relu -x x y Leaky relu -x x y Leaky relu -x x y Leaky relu -x x y Softplus 8/16Fang, Schreiner, Sokoloff, Weisser, Williams A hybrid deep learning approach to vertexing April 3, 2019
  • 15. Cost function Design 10 6 10 5 10 4 10 3 10 2 10 1 100 yhat 0 10 20 30 40 50 60 cost 0.0 0.2 0.4 0.6 0.8 yhat 0 5 10 15 20 25 30 cost Asym. Cost for y = 0.10 Symm. Cost for y = 0.10 Asym. Cost for y = 0.30 Symm. Cost for y = 0.30 Asym. Cost for y = 1e-5 Symm. Cost for y = 1e-5 0.2 0.4 0.6 0.8 1.0 yhat 0 2 4 6 8 10 cost Approach • Symmetric cost function: low FP but low efficiency • Adding asymmetry term controls trade-off for FP vs. efficiency 9/16Fang, Schreiner, Sokoloff, Weisser, Williams A hybrid deep learning approach to vertexing April 3, 2019
  • 16. False Positive and efficiency rates Results 88 89 90 91 92 93 94 Efficiency [%] 0.05 0.10 0.15 0.20 0.25FPperevent Symm cost Most asymm cost 88 89 90 91 92 93 94 Efficiency [%] 10 1 6×10 2 2×10 1 FPperevent Symm cost Most asymm cost Search for PVs (handwritten, maybe not optimial) • Search ±5 bins (±500µm) around a true PV • At least 3 bins with predicted probability > 1% and integrated probability > 20%. Tunable efficiency vs. FP • The asymmetry parameter controls FP vs. efficiency 10/16Fang, Schreiner, Sokoloff, Weisser, Williams A hybrid deep learning approach to vertexing April 3, 2019
  • 17. Compare predictions with targets: Examples (Results). [Figure, two panels: kernel density, xy maximum [µm], and target/predicted/masked probability versus z [mm].] • Event 5 @ 197.4 mm: PV found (True: 197.461 mm, Pred: 197.396 mm, Δ: -65 µm) • Event 6 @ 36.1 mm: PV found (True: 36.068 mm, Pred: 36.400 mm, Δ: 332 µm)
  • 18. Compare predictions with targets: When it works (Results). [Figure, two panels: kernel density, xy maximum [µm], and target/predicted/masked probability versus z [mm].] • Event 0 @ 48.9 mm: PV found (True: 48.904 mm, Pred: 48.954 mm, Δ: 50 µm) • Event 0 @ 1.0 mm: Masked, <5 tracks (Pred: 0.976 mm)
  • 19. Compare predictions with targets: When it fails (Results). [Figure, two panels: kernel density, xy maximum [µm], and target/predicted/masked probability versus z [mm].] • Event 2 @ 65.7 mm: False positive (Pred: 65.696 mm) • Event 3 @ 51.9 mm: PV not found (True: 51.898 mm)
  • 20. Future addition: xy information (Future plans). Adding xy information: • Point of maximum z in xy available • Extra information: sharp discontinuities between PVs • Need iterative approach or “reduced importance” What about a full 2D kernel? • Not needed for LHCb currently (large xy, “low” z overlap) • Might be useful for other detectors! [Figure: kernel density, xy maximum [µm], and target/predicted probability versus z [mm]; Event 2 @ 114.6 mm: PV found (True: 114.622 mm, Pred: 114.597 mm, Δ: -26 µm).]
  • 21. Conclusions and plans (Future plans). [Figure: efficiency versus number of LHCb long tracks.] • Proof of principle established: a hybrid ML algorithm using a 1-dimensional KDE processed by a 5-layer CNN finds primary vertices with efficiencies and false positive rates similar to traditional algorithms. • Efficiency is tunable; increasing the efficiency also increases the false positive rate. • Adding information should improve performance: can add KDE (x,y) information to the algorithm; can associate tracks to PV candidates, then iterate. • Next steps: train with full LHCb MC and deploy an inference engine in the LHCb Hlt1 framework. • Beyond LHCb: the approach might work for ATLAS and CMS (in 2D?); the algorithm is an interesting ML laboratory.
  • 22. Final words (Future plans). Source code: • https://gitlab.cern.ch/LHCb-Reco-Dev/pv-finder • Runnable with Conda on macOS and Linux. Run: conda env create -f environment-gpu.yml • Python 3.6+ and PyTorch used for the machine learning code • Generation now available too, using the new Conda-Forge ROOT and Pythia8 packages. Supported by: • NSF OAC-1836650: IRIS-HEP • NSF OAC-1740102: SI2:SSE • NSF OAC-1739772: SI2:SSE
  • 24. More predictions with targets (1) (Backup). [Figure, two panels: kernel density, xy maximum [µm], and target/predicted/masked probability versus z [mm].] • Event 5 @ 221.5 mm: PV found (True: 221.595 mm, Pred: 221.546 mm, Δ: -49 µm) • Event 2 @ 114.6 mm: PV found (True: 114.622 mm, Pred: 114.597 mm, Δ: -26 µm)
  • 25. More predictions with targets (2) (Backup). [Figure, two panels: kernel density, xy maximum [µm], and target/predicted/masked probability versus z [mm].] • Event 6 @ 129.3 mm: PV found (True: 129.336 mm, Pred: 129.337 mm, Δ: 1 µm) • Event 6 @ 143.2 mm: PV found (True: 143.224 mm, Pred: 143.199 mm, Δ: -25 µm)
  • 26. More predictions with targets (3) (Backup). [Figure, two panels: kernel density, xy maximum [µm], and target/predicted/masked probability versus z [mm].] • Event 6 @ 150.4 mm: PV found (True: 150.650 mm, Pred: 150.416 mm, Δ: -234 µm) • Event 6 @ 179.6 mm: PV found (True: 179.560 mm, Pred: 179.591 mm, Δ: 31 µm)
  • 27. The VELO (Backup). Tracks: • Originate from vertices (not shown) • Hits originate from tracks • We only know the true track in simulation • Nearly straight, but tracks may scatter in material. The VELO: • A set of 26 planes that detect tracks • Tracks should hit one or more pixels per plane • Sparse 3D dataset (41M pixels)
  • 28. Questions for other experiments (Backup). • Beam width (x, y): 40 µm for LHCb, what is yours? • Transverse resolution: 5–15 µm for LHCb depending on number of tracks, what is yours? • Longitudinal resolution: 40–100 µm for LHCb depending on number of tracks, what is yours? • Cleaning up prototracks based on IP could simplify the kernel • Can prototracking be done in the triggers?