ML Methods for Online Tracking
Alex Gekow
08 Feb 2024
1
Online Vs. Offline Tracking
Offline Tracking
● ∞ time to run algorithms
● Huge amount of available CPU
● Highly specialized precision
algorithms
2
Online Tracking
● L1 Latency constraint
● Limited budget for hardware
● Balance tracking precision
with computational cost/speed
EF Tracking Goal: to run tracking at trigger level in 𝝁=200 pileup conditions
[Diagram: balancing tracking performance against computational performance]
Why Machine Learning?
Machine learning algorithms, and the hardware and software required to deploy them, are a rapidly
expanding domain; we can utilize, learn from, and contribute to this development (e.g. Tesla FSD, Apple
Neural Engine, Google Tensor…)
3
Why Machine Learning?
Neural networks have proven to be a powerful and versatile tool over a wide range of problems
They excel at exploiting correlations between input parameters to produce a non-linear mapping f: ℝ^input ⟶ ℝ^output
Adapting offline algorithms for new hardware has proven difficult. Why not try to develop ML algorithms for
tracking to leverage newer, faster hardware?
4
Start Simple - Fake Track Classifier
Classification is the most common task for Neural Networks
Train a NN to classify track candidates as True/Fake
The output of the NN is highly dependent on the definition of fake “tracks”, which is in turn highly
dependent on the algorithm used to generate them (a minimal sketch of such a classifier follows below)
5
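For concreteness, a minimal sketch of such a classifier, here in PyTorch; the fixed number of hits per candidate, the layer widths, and the training loss are illustrative assumptions, not the actual EF-Tracking model:

```python
import torch.nn as nn

class FakeTrackClassifier(nn.Module):
    """Binary classifier: flattened (x, y, z) hit coordinates -> logit for "true track"."""

    def __init__(self, n_hits=8, hidden=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3 * n_hits, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),          # single logit: true vs. fake
        )

    def forward(self, hits):               # hits: (batch, n_hits, 3)
        return self.net(hits.flatten(1))

model = FakeTrackClassifier()
loss_fn = nn.BCEWithLogitsLoss()           # trained against true/fake labels
```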
Hough Transform Filter
● The Hough transform is another inexpensive & fast algorithm being studied in parallel
○ Comes at the cost of a large number of fake hit combinations
● The algorithm requires a fake removal step
● Offline tracking does this via a 𝝌² calculation
○ Can we approximate this figure of merit (or come up with another) using a NN?
Problem statement in ML language:
“Classify a sequence of hits as true or fake given the 3D coordinates of the candidate track’s hits”
ML Classifier = fast figure of merit generator
6
Hough Transform Filter
Classification is the most common application of ML! Operating on HT output offers a perfect environment to test
our hypothesis
1. Pre-processing
a. Rotate all proto-tracks to initially lie along 𝜙=0
b. Scale each hit x/y/z coordinate to be O(1)
2. Score each proto-track with NN Classifier
3. Overlap removal via hit warrior
a. Compare proto-tracks with more than X shared hits
b. Keep only the highest scored proto-track
Reduces the number of fake tracks by two orders of magnitude while retaining a high purity of true
track candidates (a minimal sketch of the pre-processing and overlap-removal steps follows below)
*Similar in principle to scoring of conformal map here, but using all hits instead of only three
https://indico.cern.ch/event/1002734/contributions/4231250/attachments/2192619/3706144/CommodityTF_210218.pdf
7
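As referenced above, a minimal NumPy sketch of the pre-processing and hit-warrior overlap removal; the proto-track format, the 1000 mm scale factor, and the shared-hit threshold are illustrative assumptions, and the NN scoring itself is left as an external input:

```python
import numpy as np

def preprocess(hits):
    """Rotate a proto-track so its first hit lies along phi = 0, then scale to O(1)."""
    phi0 = np.arctan2(hits[0, 1], hits[0, 0])          # hits: (n_hits, 3) in mm
    c, s = np.cos(-phi0), np.sin(-phi0)
    rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return (hits @ rot.T) / 1000.0                     # assumes O(1000 mm) coordinates

def hit_warrior(proto_tracks, scores, max_shared_hits=2):
    """Among proto-tracks sharing more than max_shared_hits, keep only the best scored."""
    order = np.argsort(scores)[::-1]                   # highest NN score first
    kept, kept_hit_sets = [], []
    for i in order:
        hits_i = {tuple(h) for h in proto_tracks[i]}
        if any(len(hits_i & hs) > max_shared_hits for hs in kept_hit_sets):
            continue                                   # overlaps a better-scored track
        kept.append(i)
        kept_hit_sets.append(hits_i)
    return kept                                        # indices of surviving proto-tracks
```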
Path Finder
Step up complexity: Can we use ML to find proto-tracks?
8
Assume spacepoint formation of hits and three-hit seeds in the innermost pixel layers are available
upstream of the pattern recognition algorithm
1. Input 3 hits into a NN
2. Predict the coordinate of the 4th hit
3. Look for hits in the detector nearby the predicted location
4. Append all compatible hits to the seed
5. Repeat until the edge of the detector is reached or no compatible hits are found (a schematic version of this loop follows below)
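A schematic version of this loop in Python; `predict_next_hit` stands in for the trained NN and the search radius is an arbitrary placeholder. For brevity only the closest compatible hit is appended here, whereas the slide keeps all compatible hits (i.e. allows branching):

```python
import numpy as np

def build_proto_track(seed_hits, detector_hits, predict_next_hit,
                      search_radius=10.0, max_steps=12):
    """Extend a 3-hit seed outward by repeatedly predicting the next hit location.

    seed_hits:     (3, 3) array of seed (x, y, z) coordinates
    detector_hits: (N, 3) array of the remaining hits in the event/region
    """
    track = [h for h in seed_hits]
    for _ in range(max_steps):                                     # bounded by detector depth
        predicted = predict_next_hit(np.asarray(track[-3:]))       # step 2
        dists = np.linalg.norm(detector_hits - predicted, axis=1)  # step 3
        if dists.min() > search_radius:                            # no compatible hit -> stop
            break
        track.append(detector_hits[np.argmin(dists)])              # step 4 (closest hit only)
    return np.asarray(track)
```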
Parallelized Predictions
Without much external information (e.g. magnetic field, detector geometry…) needed at run time, we can get
simultaneous predictions for O(100-1000) proto-tracks at a time
9
Not relying on external information being bussed to the FPGA saves time and memory :)
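Because the predictor needs only hit coordinates at run time, extending many proto-tracks is a single batched forward pass; a toy PyTorch illustration (untrained weights, shapes only):

```python
import torch
import torch.nn as nn

# toy hit predictor: last three hits (flattened) -> next (x, y, z)
hit_predictor = nn.Sequential(nn.Linear(9, 64), nn.ReLU(), nn.Linear(64, 3))

last_hits = torch.randn(1000, 3, 3)        # 1000 candidates x 3 hits x (x, y, z)
with torch.no_grad():
    next_hits = hit_predictor(last_hits.flatten(1))   # one pass -> (1000, 3) predictions
```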
Performance Metrics
Our goal is to produce sufficiently few proto-tracks for overlap/fake removal while retaining high
efficiency
1. Study the residuals between predicted and true hits to minimize the search window as a function of (r, η) (a sketch of this binning follows below)
2. Count the number of fake proto-tracks generated per event
a. Most will be removed by hit warrior
3. Tune the fake track classifier threshold cut
At this stage we are NOT interested in precision; we simply constrain the number of found proto-tracks
in order to remain within the latency budget. The precision fit will come afterwards
10
For all tracks within |η| < 0.8, |z₀| < 150 mm, |d₀| < 2 mm.
Most tracks are true duplicates due to multiple hits in SCT layers.
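A sketch of the first metric, binning the prediction residuals in (r, η) to size the search window; the binning and the 95% quantile are illustrative choices, not those of the actual study:

```python
import numpy as np

def search_window_map(pred, true, n_r_bins=10, n_eta_bins=10, quantile=0.95):
    """Per-(r, eta) window radius containing `quantile` of the prediction residuals."""
    residual = np.linalg.norm(pred - true, axis=1)     # pred, true: (N, 3) hit coordinates
    r = np.hypot(true[:, 0], true[:, 1])
    eta = -np.log(np.tan(np.arctan2(r, true[:, 2]) / 2.0))

    r_edges = np.linspace(r.min(), r.max(), n_r_bins + 1)
    eta_edges = np.linspace(eta.min(), eta.max(), n_eta_bins + 1)
    window = np.full((n_r_bins, n_eta_bins), np.nan)
    for i in range(n_r_bins):
        for j in range(n_eta_bins):
            mask = ((r >= r_edges[i]) & (r < r_edges[i + 1]) &
                    (eta >= eta_edges[j]) & (eta < eta_edges[j + 1]))
            if mask.any():
                window[i, j] = np.quantile(residual[mask], quantile)
    return window
```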
Barrel Only Application
11
The algorithm was tested and validated in the barrel of ITk
Efficiency defined as N_matched / N_constructed, where a “track” is considered matched iff there exists a
constructed proto-track such that more than half of the hits after the seed come from a unique particle
(a sketch of this matching criterion follows below)
Comparable performance to ODD geometry https://arxiv.org/abs/2212.02348
Prediction residuals, NOT track resolution! Residuals to be compared to the green band in the CKF analogy
Limited/worse precision is OK so long as the gain in speed outweighs the number of found proto-tracks
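A minimal sketch of the matching criterion referenced above; the truth association is assumed to be available as one particle ID per appended hit (data layout illustrative):

```python
from collections import Counter

def is_matched(particle_ids_after_seed):
    """True if more than half of the hits after the seed come from a single particle."""
    if not particle_ids_after_seed:
        return False
    _, count = Counter(particle_ids_after_seed).most_common(1)[0]
    return count > len(particle_ids_after_seed) / 2

def efficiency(proto_track_particle_ids):
    """N_matched / N_constructed over the constructed proto-tracks."""
    matched = sum(is_matched(pids) for pids in proto_track_particle_ids)
    return matched / len(proto_track_particle_ids)
```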
Overview of Algorithms
12
The irregular orientation of the detector layers of ITk makes straightforward coordinate prediction difficult for a NN
If spacepoints and the detector layer are given as input, the NN learns to associate discrete sets of
coordinates to each detector layer
https://atlas.web.cern.ch/Atlas/GROUPS/PHYSICS/PLOTS/ITK-2020-002/
[Diagram: the coordinates (x, y, z) of the last three hits (t−2, t−1, t) feed an NN Classifier that predicts the Volume ID and Layer ID of the next hit; together with the hits, these feed an NN Hit Predictor that outputs the coordinates (x, y, z) at t+1. A code sketch follows below.]
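A PyTorch sketch of the two-model arrangement in the diagram above; the hidden widths and the ITk volume/layer counts (`n_volumes`, `n_layers`) are placeholder values, and one-hot IDs are assumed for simplicity:

```python
import torch
import torch.nn as nn

class LayerClassifier(nn.Module):
    """Last three hits (x, y, z) -> which detector volume/layer the next hit is on."""

    def __init__(self, n_volumes=5, n_layers=8, hidden=64):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(9, hidden), nn.ReLU(),
                                  nn.Linear(hidden, hidden), nn.ReLU())
        self.volume_head = nn.Linear(hidden, n_volumes)
        self.layer_head = nn.Linear(hidden, n_layers)

    def forward(self, hits):                        # hits: (batch, 9)
        h = self.body(hits)
        return self.volume_head(h), self.layer_head(h)

class HitPredictor(nn.Module):
    """Last three hits plus (one-hot) volume/layer IDs -> next-hit coordinates."""

    def __init__(self, n_volumes=5, n_layers=8, hidden=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(9 + n_volumes + n_layers, hidden), nn.ReLU(),
                                 nn.Linear(hidden, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 3))      # (x, y, z) at t+1

    def forward(self, hits, volume_onehot, layer_onehot):
        return self.net(torch.cat([hits, volume_onehot, layer_onehot], dim=1))
```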
Full Detector Predictions
13
[Plot: endcap predictions (predicted vs. target hits) overlaid with the target hit coordinates, in normalized (z, ⍴) coordinates.]
[Plot: a found track in the barrel of an ITk-like geometry, y vs. x in mm, showing the true seed, true hits, and predicted hits.]
Improvements to ML Models
Improving the predictive power of the NN improves efficiency and reduces the fake rate. Several methods
are under investigation:
1. Metric learning for layer/volume encodings
a. Exploit relationships between neighboring regions of the detector
2. Recurrent models
a. Exploit the sequential nature of our data
3. Fine-tuned loss functions
a. Mean-squared-error assumes constant uncertainty; remove this assumption for better results
(more difficult to train; see the loss sketch below)
14
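For item 3, one standard way to drop the constant-uncertainty assumption of MSE is a Gaussian negative-log-likelihood loss in which the network also outputs a per-coordinate log-variance; a minimal sketch (not necessarily the exact loss used here):

```python
import torch

def gaussian_nll(mean, log_var, target):
    """Squared error weighted by a learned per-output uncertainty (constants dropped)."""
    return (0.5 * (log_var + (target - mean) ** 2 / log_var.exp())).mean()

# The output layer grows from 3 to 6 units: 3 predicted coordinates + 3 log-variances.
pred = torch.randn(32, 6)                  # toy batch
target = torch.randn(32, 3)
loss = gaussian_nll(pred[:, :3], pred[:, 3:], target)
```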
Track Parameter Estimation
15
Potential EF-Tracking pipeline: Seeding → Track Finding → Fake Removal → Linear Fit → Precision Track Fit
Candidate implementations include the Hough Transform, a GNN, ML Path Finding, Hit Condensation, the NN Fake Track Filter, and Parameter Estimation to provide an initial guess for the Kalman Filter, …
Is the linear fit redundant?
Can the NN used for fake track removal also provide track parameter estimates?
16
Track Parameter Estimation
Warning! Not realistic results. Proof of principle only
● Initial, non-rigorous testing shows promise in the NN's capability to estimate track parameters
● NN = linear fit under certain conditions
● Can it outperform the linear fit? Does it need to? (a sketch of a shared classification + regression model follows below)
1. Currently generating and studying optimal training data
2. Interface to precision fitting to determine the effect of the estimates when used as the initial guess
Extremely Preliminary! Bug in z₀ being fixed
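As referenced above, a sketch of how the fake-track classifier could share a backbone with a parameter-regression head; the five-parameter output (d₀, z₀, φ, θ, q/pT) and the layer sizes are assumptions for illustration:

```python
import torch.nn as nn

class TrackNet(nn.Module):
    """Shared backbone with a fake/true classification head and a track-parameter head."""

    def __init__(self, n_hits=8, hidden=64):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(3 * n_hits, hidden), nn.ReLU(),
                                      nn.Linear(hidden, hidden), nn.ReLU())
        self.fake_head = nn.Linear(hidden, 1)       # true/fake logit
        self.param_head = nn.Linear(hidden, 5)      # (d0, z0, phi, theta, q/pT)

    def forward(self, hits):                        # hits: (batch, n_hits, 3)
        h = self.backbone(hits.flatten(1))
        return self.fake_head(h), self.param_head(h)
```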
FPGA Implementation
We have been using relatively small networks (2-3 hidden layers, 32-64 nodes)
Execution on FPGA takes only 50 ns (10 clock cycles) and is perfectly pipelined
To make N predictions, we require N+10 clock cycles
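As a rough back-of-the-envelope check: 50 ns over 10 clock cycles implies a 5 ns clock period (a 200 MHz clock, an inferred figure), so N = 1000 predictions would take (1000 + 10) × 5 ns ≈ 5.05 µs.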
17
[Timing diagram: input and output waveforms against the clock, illustrating perfectly pipelined execution.]
FPGA Implementation
NN models should be compact so as to fit easily on the FPGA, sharing resources with other algorithms also running on the FPGA
Firmware and test vectors actively being developed!
18
Algorithm              Latency (ns)   LUT (%)   FF (%)   BRAM/URAM (%)   DSP (%)
Ambiguity Resolution   50             18        1        <0.01           31
Hit Prediction         50             7         0.5      <0.01           21
Xilinx Alveo U250 FPGA resource usage estimates for neural networks
* rough estimates as NN architecture may change
Hit prediction only includes coordinate hit prediction and not layer classification
Classification network is smaller than hit prediction network
Conclusion
● Online tracking involves re-evaluating particle tracking problems in light of new constraints
(latency, throughput, hardware…)
● Versatility makes for a large space of potential solutions
○ Full ML tracking (GNN)
○ ML as a tool in a larger toolkit (Fake track filter, pattern recognition, parameter estimation)
● Smaller scale ML approaches can be very flexible and fit readily on FPGA/GPU
● Need to interface new algorithms with existing “traditional” algorithms
○ Core software and firmware development ongoing by the EF-Tracking team
19