ML Methods for Online Tracking
Alex Gekow
08 Feb 2024
1
Online Vs. Offline Tracking
Offline Tracking
● ∞ time to run algorithms
● Huge amount of available CPU
● Highly specialized precision
algorithms
2
Online Tracking
● L1 Latency constraint
● Limited budget for hardware
● Balance tracking precision
with computational cost/speed
EF Tracking Goal: to run tracking at trigger level in 𝝁=200 pileup conditions
[Diagram: balancing tracking performance against computational performance]
Why Machine Learning?
Machine learning algorithms, and the hardware and software required to deploy them, are a rapidly
expanding domain; we can utilize, learn from, and contribute to this development (e.g. Tesla FSD, Apple
Neural Engine, Google Tensor…)
3
Why Machine Learning?
Neural networks have proven to be a powerful and versatile tool over a wide range of problems
They excel at exploiting correlations between input parameters to produce a non-linear mapping f: ℝ^input ⟶ ℝ^output
Adapting offline algorithms for new hardware has proven difficult. Why not try to develop ML algorithms for
tracking to leverage newer, faster hardware?
4
Start Simple - Fake Track Classifier
Classification is the most common task for Neural Networks
Train a NN to classify track candidates as True/Fake
The output of the NN is highly dependent on the definition of fake “tracks”, which is in turn highly
dependent on the algorithm used to generate them (a minimal sketch of such a classifier follows below)
5
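For concreteness, a minimal sketch of such a classifier, here in PyTorch; the fixed number of hits per candidate, the layer widths, and the training loss are illustrative assumptions, not the actual EF-Tracking model:

```python
import torch.nn as nn

class FakeTrackClassifier(nn.Module):
    """Binary classifier: flattened (x, y, z) hit coordinates -> logit for "true track"."""

    def __init__(self, n_hits=8, hidden=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3 * n_hits, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),          # single logit: true vs. fake
        )

    def forward(self, hits):               # hits: (batch, n_hits, 3)
        return self.net(hits.flatten(1))

model = FakeTrackClassifier()
loss_fn = nn.BCEWithLogitsLoss()           # trained against true/fake labels
```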
Hough Transform Filter
● The Hough transform is another inexpensive & fast algorithm being studied in parallel
○ Comes at the cost of a large number of fake hit combinations
● The algorithm requires a fake removal step
● Offline tracking does this via a 𝝌² calculation
○ Can we approximate this figure of merit (or come up with another) using a NN?
Problem statement in ML language:
“Classify a sequence of hits as true or fake given the 3D coordinates of the candidate track’s hits”
ML Classifier = fast figure of merit generator
6
Hough Transform Filter
Classification is the most common application of ML! Operating on HT output offers a perfect environment to test
our hypothesis
1. Pre-processing
a. Rotate all proto-tracks to initially lie along 𝜙=0
b. Scale each hit x/y/z coordinate to be O(1)
2. Score each proto-track with NN Classifier
3. Overlap removal via hit warrior
a. Compare proto-tracks with more than X shared hits
b. Keep only the highest scored proto-track
Reduces the number of fake tracks by two orders of magnitude while retaining a high purity of true
track candidates (a minimal sketch of the pre-processing and overlap-removal steps follows below)
*Similar in principle to scoring of conformal map here, but using all hits instead of only three
https://indico.cern.ch/event/1002734/contributions/4231250/attachments/2192619/3706144/CommodityTF_210218.pdf
7
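As referenced above, a minimal NumPy sketch of the pre-processing and hit-warrior overlap removal; the proto-track format, the 1000 mm scale factor, and the shared-hit threshold are illustrative assumptions, and the NN scoring itself is left as an external input:

```python
import numpy as np

def preprocess(hits):
    """Rotate a proto-track so its first hit lies along phi = 0, then scale to O(1)."""
    phi0 = np.arctan2(hits[0, 1], hits[0, 0])          # hits: (n_hits, 3) in mm
    c, s = np.cos(-phi0), np.sin(-phi0)
    rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return (hits @ rot.T) / 1000.0                     # assumes O(1000 mm) coordinates

def hit_warrior(proto_tracks, scores, max_shared_hits=2):
    """Among proto-tracks sharing more than max_shared_hits, keep only the best scored."""
    order = np.argsort(scores)[::-1]                   # highest NN score first
    kept, kept_hit_sets = [], []
    for i in order:
        hits_i = {tuple(h) for h in proto_tracks[i]}
        if any(len(hits_i & hs) > max_shared_hits for hs in kept_hit_sets):
            continue                                   # overlaps a better-scored track
        kept.append(i)
        kept_hit_sets.append(hits_i)
    return kept                                        # indices of surviving proto-tracks
```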
Path Finder
Step up complexity: Can we use ML to find proto-tracks?
8
Assume spacepoint formation of hits and three-hit seeds in the innermost pixel layers are available
upstream of the pattern recognition algorithm
1. Input 3 hits into a NN
2. Predict the coordinate of the 4th hit
3. Look for hits in the detector nearby the predicted location
4. Append all compatible hits to the seed
5. Repeat until the edge of the detector is reached or no compatible hits are found (a schematic version of this loop follows below)
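A schematic version of this loop in Python; `predict_next_hit` stands in for the trained NN and the search radius is an arbitrary placeholder. For brevity only the closest compatible hit is appended here, whereas the slide keeps all compatible hits (i.e. allows branching):

```python
import numpy as np

def build_proto_track(seed_hits, detector_hits, predict_next_hit,
                      search_radius=10.0, max_steps=12):
    """Extend a 3-hit seed outward by repeatedly predicting the next hit location.

    seed_hits:     (3, 3) array of seed (x, y, z) coordinates
    detector_hits: (N, 3) array of the remaining hits in the event/region
    """
    track = [h for h in seed_hits]
    for _ in range(max_steps):                                     # bounded by detector depth
        predicted = predict_next_hit(np.asarray(track[-3:]))       # step 2
        dists = np.linalg.norm(detector_hits - predicted, axis=1)  # step 3
        if dists.min() > search_radius:                            # no compatible hit -> stop
            break
        track.append(detector_hits[np.argmin(dists)])              # step 4 (closest hit only)
    return np.asarray(track)
```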
Parallelized Predictions
Without much external information (e.g. magnetic field, detector geometry…) needed at run time, we can get
simultaneous predictions for O(100-1000) proto-tracks at a time
9
Not relying on external information being bussed to the FPGA saves time and memory :)
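Because the predictor needs only hit coordinates at run time, extending many proto-tracks is a single batched forward pass; a toy PyTorch illustration (untrained weights, shapes only):

```python
import torch
import torch.nn as nn

# toy hit predictor: last three hits (flattened) -> next (x, y, z)
hit_predictor = nn.Sequential(nn.Linear(9, 64), nn.ReLU(), nn.Linear(64, 3))

last_hits = torch.randn(1000, 3, 3)        # 1000 candidates x 3 hits x (x, y, z)
with torch.no_grad():
    next_hits = hit_predictor(last_hits.flatten(1))   # one pass -> (1000, 3) predictions
```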
Performance Metrics
Our goal is to produce sufficiently few proto-tracks for overlap/fake removal while retaining high
efficiency
1. Study the residuals between predicted and true hits to minimize the search window as a function of (r, η) (a sketch of this binning follows below)
2. Count the number of fake proto-tracks generated per event
a. Most will be removed by hit warrior
3. Tune the fake track classifier threshold cut
At this stage we are NOT interested in precision; we simply constrain the number of found proto-tracks
in order to remain within the latency budget. The precision fit will come afterwards
10
For all tracks within |η| < 0.8, |z₀| < 150 mm, |d₀| < 2 mm.
Most tracks are true duplicates due to multiple hits in SCT layers.
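A sketch of the first metric, binning the prediction residuals in (r, η) to size the search window; the binning and the 95% quantile are illustrative choices, not those of the actual study:

```python
import numpy as np

def search_window_map(pred, true, n_r_bins=10, n_eta_bins=10, quantile=0.95):
    """Per-(r, eta) window radius containing `quantile` of the prediction residuals."""
    residual = np.linalg.norm(pred - true, axis=1)     # pred, true: (N, 3) hit coordinates
    r = np.hypot(true[:, 0], true[:, 1])
    eta = -np.log(np.tan(np.arctan2(r, true[:, 2]) / 2.0))

    r_edges = np.linspace(r.min(), r.max(), n_r_bins + 1)
    eta_edges = np.linspace(eta.min(), eta.max(), n_eta_bins + 1)
    window = np.full((n_r_bins, n_eta_bins), np.nan)
    for i in range(n_r_bins):
        for j in range(n_eta_bins):
            mask = ((r >= r_edges[i]) & (r < r_edges[i + 1]) &
                    (eta >= eta_edges[j]) & (eta < eta_edges[j + 1]))
            if mask.any():
                window[i, j] = np.quantile(residual[mask], quantile)
    return window
```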
Barrel Only Application
11
The algorithm was tested and validated in the barrel of ITk
Efficiency defined as N_matched / N_constructed, where a “track” is considered matched iff there exists a
constructed proto-track such that more than half of the hits after the seed come from a unique particle
(a sketch of this matching criterion follows below)
Comparable performance to ODD geometry https://arxiv.org/abs/2212.02348
Prediction residuals, NOT track resolution! Residuals to be compared to the green band in the CKF analogy
Limited/worse precision is OK so long as the gain in speed outweighs the number of found proto-tracks
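A minimal sketch of the matching criterion referenced above; the truth association is assumed to be available as one particle ID per appended hit (data layout illustrative):

```python
from collections import Counter

def is_matched(particle_ids_after_seed):
    """True if more than half of the hits after the seed come from a single particle."""
    if not particle_ids_after_seed:
        return False
    _, count = Counter(particle_ids_after_seed).most_common(1)[0]
    return count > len(particle_ids_after_seed) / 2

def efficiency(proto_track_particle_ids):
    """N_matched / N_constructed over the constructed proto-tracks."""
    matched = sum(is_matched(pids) for pids in proto_track_particle_ids)
    return matched / len(proto_track_particle_ids)
```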
Overview of Algorithms
12
The irregular orientation of the detector layers of ITk makes straightforward coordinate prediction difficult for a NN
If spacepoints and the detector layer are given as input, the NN learns to associate discrete sets of
coordinates to each detector layer
https://atlas.web.cern.ch/Atlas/GROUPS/PHYSICS/PLOTS/ITK-2020-002/
[Diagram: the coordinates (x, y, z) of the last three hits (t−2, t−1, t) feed an NN Classifier that predicts the Volume ID and Layer ID of the next hit; together with the hits, these feed an NN Hit Predictor that outputs the coordinates (x, y, z) at t+1. A code sketch follows below.]
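A PyTorch sketch of the two-model arrangement in the diagram above; the hidden widths and the ITk volume/layer counts (`n_volumes`, `n_layers`) are placeholder values, and one-hot IDs are assumed for simplicity:

```python
import torch
import torch.nn as nn

class LayerClassifier(nn.Module):
    """Last three hits (x, y, z) -> which detector volume/layer the next hit is on."""

    def __init__(self, n_volumes=5, n_layers=8, hidden=64):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(9, hidden), nn.ReLU(),
                                  nn.Linear(hidden, hidden), nn.ReLU())
        self.volume_head = nn.Linear(hidden, n_volumes)
        self.layer_head = nn.Linear(hidden, n_layers)

    def forward(self, hits):                        # hits: (batch, 9)
        h = self.body(hits)
        return self.volume_head(h), self.layer_head(h)

class HitPredictor(nn.Module):
    """Last three hits plus (one-hot) volume/layer IDs -> next-hit coordinates."""

    def __init__(self, n_volumes=5, n_layers=8, hidden=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(9 + n_volumes + n_layers, hidden), nn.ReLU(),
                                 nn.Linear(hidden, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 3))      # (x, y, z) at t+1

    def forward(self, hits, volume_onehot, layer_onehot):
        return self.net(torch.cat([hits, volume_onehot, layer_onehot], dim=1))
```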
Full Detector Predictions
13
[Plot: endcap predictions (predicted vs. target hits) overlaid with the target hit coordinates, in normalized (z, ⍴) coordinates.]
[Plot: a found track in the barrel of an ITk-like geometry, y vs. x in mm, showing the true seed, true hits, and predicted hits.]
Improvements to ML Models
Improving the predictive power of the NN improves efficiency and reduces the fake rate. Several methods
are under investigation:
1. Metric learning for layer/volume encodings
a. Exploit relationships between neighboring regions of the detector
2. Recurrent models
a. Exploit the sequential nature of our data
3. Fine-tuned loss functions
a. Mean-squared-error assumes constant uncertainty; remove this assumption for better results
(more difficult to train; see the loss sketch below)
14
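For item 3, one standard way to drop the constant-uncertainty assumption of MSE is a Gaussian negative-log-likelihood loss in which the network also outputs a per-coordinate log-variance; a minimal sketch (not necessarily the exact loss used here):

```python
import torch

def gaussian_nll(mean, log_var, target):
    """Squared error weighted by a learned per-output uncertainty (constants dropped)."""
    return (0.5 * (log_var + (target - mean) ** 2 / log_var.exp())).mean()

# The output layer grows from 3 to 6 units: 3 predicted coordinates + 3 log-variances.
pred = torch.randn(32, 6)                  # toy batch
target = torch.randn(32, 3)
loss = gaussian_nll(pred[:, :3], pred[:, 3:], target)
```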
Track Parameter Estimation
15
Potential EF-Tracking pipeline: Seeding → Track Finding → Fake Removal → Linear Fit → Precision Track Fit
Candidate implementations include the Hough Transform, a GNN, ML Path Finding, Hit Condensation, the NN Fake Track Filter, and Parameter Estimation to provide an initial guess for the Kalman Filter, …
Is the linear fit redundant?
Can the NN used for fake track removal also provide track parameter estimates?
16
Track Parameter Estimation
Warning! Not realistic results. Proof of principle only
● Initial, non-rigorous testing shows promise in the NN's capability to estimate track parameters
● NN = linear fit under certain conditions
● Can it outperform the linear fit? Does it need to? (a sketch of a shared classification + regression model follows below)
1. Currently generating and studying optimal training data
2. Interface to precision fitting to determine the effect of the estimates when used as the initial guess
Extremely Preliminary! Bug in z₀ being fixed
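As referenced above, a sketch of how the fake-track classifier could share a backbone with a parameter-regression head; the five-parameter output (d₀, z₀, φ, θ, q/pT) and the layer sizes are assumptions for illustration:

```python
import torch.nn as nn

class TrackNet(nn.Module):
    """Shared backbone with a fake/true classification head and a track-parameter head."""

    def __init__(self, n_hits=8, hidden=64):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(3 * n_hits, hidden), nn.ReLU(),
                                      nn.Linear(hidden, hidden), nn.ReLU())
        self.fake_head = nn.Linear(hidden, 1)       # true/fake logit
        self.param_head = nn.Linear(hidden, 5)      # (d0, z0, phi, theta, q/pT)

    def forward(self, hits):                        # hits: (batch, n_hits, 3)
        h = self.backbone(hits.flatten(1))
        return self.fake_head(h), self.param_head(h)
```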
FPGA Implementation
We have been using relatively small networks (2-3 hidden layers, 32-64 nodes)
Execution on FPGA takes only 50 ns (10 clock cycles) and is perfectly pipelined
To make N predictions, we require N+10 clock cycles
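As a rough back-of-the-envelope check: 50 ns over 10 clock cycles implies a 5 ns clock period (a 200 MHz clock, an inferred figure), so N = 1000 predictions would take (1000 + 10) × 5 ns ≈ 5.05 µs.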
17
[Timing diagram: input and output waveforms against the clock, illustrating perfectly pipelined execution.]
FPGA Implementation
NN models should be compact so as to fit easily on the FPGA, sharing resources with other algorithms also running on the FPGA
Firmware and test vectors actively being developed!
18
Algorithm              Latency (ns)   LUT (%)   FF (%)   BRAM/URAM (%)   DSP (%)
Ambiguity Resolution   50             18        1        <0.01           31
Hit Prediction         50             7         0.5      <0.01           21
Xilinx Alveo U250 FPGA resource usage estimates for neural networks
* rough estimates as NN architecture may change
Hit prediction only includes coordinate hit prediction and not layer classification
Classification network is smaller than hit prediction network
Conclusion
● Online tracking involves re-evaluating particle tracking problems in light of new constraints
(latency, throughput, hardware…)
● Versatility makes for a large space of potential solutions
○ Full ML tracking (GNN)
○ ML as a tool in a larger toolkit (Fake track filter, pattern recognition, parameter estimation)
● Smaller scale ML approaches can be very flexible and fit readily on FPGA/GPU
● Need to interface new algorithms with existing “traditional” algorithms
○ Core software and firmware development ongoing by the EF-Tracking team
19