Saurabh Ghanekar, Kavi Global
Kazutaka Takahashi, University of Chicago
Fiducial Marker Tracking
Using Machine Vision
#AISAIS14
Outline
• Motivation & Goals
• Approach
• Results
• Next Steps
Motivation
• Feeding is a highly complex, life-sustaining behavior, essential for
survival in all species
• Certain neurological conditions, such as Parkinson’s disease, ALS, and
stroke, can cause difficulty in chewing and swallowing, known as
dysphagia
• Affects quality of life
• Dysphagia can lead to malnutrition, dehydration, and aspiration
End-Goal
To characterize feeding dynamics and gain
insights into feeding behavior changes caused by
certain neurological conditions and changes in oral
environment.
Current State
• Study focused on rodents
• XROMM (X-ray Reconstruction of Moving Morphology) videos of rodents feeding on kibble
• Videos recorded from two camera angles simultaneously
• Radio-opaque markers implanted in skull, mandible, tongue
• Movement of markers needs to be tracked and quantified
• The marker tracking process is extremely tedious, as it is done using
manual, frame-by-frame methods [1,2]
• Consumes valuable time, thus delaying further research
Immediate Goal
A near-automated, deep learning-based solution for detecting and
tracking markers, resulting in a more efficient and robust process
(c) Bunyak et al, 2017
Approach: Key Steps
1. Data In: Read in videos frame by frame, for the left and right
cameras, in 2D (x, y).
2. Head and Marker Detection: Use a neural network to identify the
bounding box of the head and pinpoint the unlabeled markers inside
the bounding box.
3. Marker Tracking: Employ Kalman filters along with the Hungarian
algorithm to keep track of markers from frame to frame.
4. Sequence Matching: Match sequence tracks from the left and right
cameras.
5. 2D to 3D Conversion: Feed the 2D left/right coordinates, along with
rotation matrices and a translation vector, to get the final 3D
coordinates (x, y, z).
Data Description
• 13 pairs of videos (left & right camera) available for training
• 720px by 1260px videos, recorded at 250 fps, ~10 seconds each
• Head and marker coordinates per frame used for model training &
evaluation
• 18-20 markers to be tracked in each video
(Example frames: Camera 1 and Camera 2.)
Head and Marker Detection
• TensorFlow’s Object Detection API
• Single Shot Multibox Detector (SSD) with MobileNet using transfer
learning from the MS COCO dataset
• Key model parameters:
– Initial Learning Rate: 0.0004
– Feature Extractor Type: ssd_mobilenet_v1
– Minimum Depth: 16
– Depth Multiplier: 1.0
– conv_hyperparams: activation: RELU_6; regularizer: l2_regularizer; weight: 0.00004
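For reference, the listed hyperparameters map onto a TensorFlow Object Detection API `pipeline.config` fragment roughly like the following sketch; the `num_classes`, checkpoint path, and optimizer wrapper are assumptions not stated on the slide:

```
model {
  ssd {
    num_classes: 2  # head and marker (assumed from the two object types above)
    feature_extractor {
      type: "ssd_mobilenet_v1"
      min_depth: 16
      depth_multiplier: 1.0
      conv_hyperparams {
        activation: RELU_6
        regularizer { l2_regularizer { weight: 0.00004 } }
      }
    }
  }
}
train_config {
  # transfer learning from MS COCO; checkpoint path is a placeholder
  fine_tune_checkpoint: "ssd_mobilenet_v1_coco/model.ckpt"
  optimizer {
    rms_prop_optimizer {
      learning_rate {
        exponential_decay_learning_rate { initial_learning_rate: 0.0004 }
      }
    }
  }
}
```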
Marker Tracking
Multi-object tracking involves three key
components:
• Predicting the object location in the next frame
• Associating predictions with existing objects
• Track Management
(c) Howe, Holcombe, 2012
Prediction
• Kalman Filter is used to predict marker location in the next frame
• Estimate position recursively in each frame, based on previous frames
• Uses Bayesian inference to estimate a joint probability distribution over position and velocity
• Start with initial velocity estimate & covariance matrix
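The prediction step above can be sketched as a constant-velocity Kalman filter in NumPy; the `dt`, noise covariances, and initial state below are illustrative assumptions, not the authors' tuned settings:

```python
import numpy as np

dt = 1.0  # one time step per frame (videos run at 250 fps)

# Constant-velocity model: state = [x, y, vx, vy]
F = np.array([[1, 0, dt, 0],
              [0, 1, 0, dt],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], dtype=float)  # state transition
H = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0]], dtype=float)  # only (x, y) is observed
Q = np.eye(4) * 0.01                       # process noise (tuning assumption)
R = np.eye(2) * 1.0                        # measurement noise (tuning assumption)

def predict(x, P):
    """Predict the marker state for the next frame."""
    return F @ x, F @ P @ F.T + Q

def update(x, P, z):
    """Correct the prediction with a detected marker position z = (x, y)."""
    y = z - H @ x                       # innovation
    S = H @ P @ H.T + R                 # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)      # Kalman gain
    return x + K @ y, (np.eye(4) - K @ H) @ P

# Initial state: a detected position plus an initial velocity guess
x = np.array([100.0, 200.0, 1.0, 0.0])
P = np.eye(4) * 10.0                    # initial covariance
x, P = predict(x, P)                    # predicted position: (101, 200)
```

Each frame alternates `predict` with an `update` using the detection associated to the track, which is where the Hungarian assignment on the next slide comes in.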
Association
• After prediction, an assignment cost-matrix is computed from the
bounding-box intersection-over-union (IoU)
• Hungarian Algorithm is used to optimally associate markers
Example assignment cost matrix (rows: predicted marker positions;
columns: true marker positions):

         0     1     2   ...
    0   518   101   312
    1    24   963   225
    2   872    20   220
  ...
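The association step can be sketched with SciPy's Hungarian solver, `scipy.optimize.linear_sum_assignment`, over an IoU-based cost matrix; the box format and IoU threshold below are illustrative assumptions:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    union = ((a[2] - a[0]) * (a[3] - a[1]) +
             (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def associate(predicted, detected, iou_threshold=0.3):
    """Match predicted boxes to detections; pairs below the IoU
    threshold are left unassigned (their tracks may die)."""
    cost = np.array([[1.0 - iou(p, d) for d in detected] for p in predicted])
    rows, cols = linear_sum_assignment(cost)  # Hungarian algorithm
    return [(r, c) for r, c in zip(rows, cols)
            if 1.0 - cost[r, c] >= iou_threshold]

predicted = [(0, 0, 10, 10), (20, 20, 30, 30)]
detected = [(21, 21, 31, 31), (1, 1, 11, 11)]
matches = associate(predicted, detected)
# marker 0 -> detection 1, marker 1 -> detection 0
```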
Track Management
• If IoU is below a set threshold, there is no assignment
• Also, not all potential tracks become actual tracks
• As a result, tracks may die and new ones are born
• The output of the Kalman filter and the Hungarian algorithm can
result in a large number of discontinuous tracks
• These are “stitched” together by looking forward and
backward a number of frames to find the best match based
on closest Euclidean distance
• At the end, we get one track per marker
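A minimal sketch of the stitching idea, assuming a simple greedy forward match with hypothetical gap and distance thresholds (the authors' exact look-ahead/look-back procedure may differ):

```python
import numpy as np

def stitch(fragments, max_gap=5, max_dist=15.0):
    """Join track fragments: a fragment is appended to the track that
    ended within `max_gap` frames of its start, at the closest Euclidean
    distance under `max_dist`. Thresholds are illustrative assumptions."""
    fragments = sorted(fragments, key=lambda f: f["start"])
    tracks = []
    for frag in fragments:
        best, best_dist = None, max_dist
        for track in tracks:
            gap = frag["start"] - track["end"]
            if 0 < gap <= max_gap:
                d = np.linalg.norm(np.array(track["points"][-1]) -
                                   np.array(frag["points"][0]))
                if d < best_dist:
                    best, best_dist = track, d
        if best is not None:                      # stitch onto an existing track
            best["points"].extend(frag["points"])
            best["end"] = frag["end"]
        else:                                     # a new track is born
            tracks.append({"start": frag["start"], "end": frag["end"],
                           "points": list(frag["points"])})
    return tracks

frags = [
    {"start": 0, "end": 2, "points": [(10, 10), (11, 10), (12, 11)]},
    {"start": 4, "end": 5, "points": [(13, 11), (14, 12)]},  # same marker, short gap
    {"start": 0, "end": 5, "points": [(100, 100)] * 6},      # a different marker
]
tracks = stitch(frags)  # the first two fragments merge into one track
```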
Sequence Matching
• After generating marker tracks separately for
each camera, corresponding tracks from
each camera must be matched
• We tried several distinct methodologies, such as time-series
clustering and different correlation measures
• Spearman correlation on frame-to-frame
changes in Y-coordinate values gave the best
results (100% accuracy on manually tracked
data)
(Diagram: predicted (x, y) tracks for the left and right cameras.)
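The Spearman-based matching can be sketched with `scipy.stats.spearmanr` on frame-to-frame Y changes; the toy tracks below are illustrative, not data from the study:

```python
import numpy as np
from scipy.stats import spearmanr

def match_tracks(left_tracks, right_tracks):
    """Pair each left-camera track with the right-camera track whose
    frame-to-frame Y changes are most Spearman-correlated."""
    pairs = {}
    for i, ly in enumerate(left_tracks):
        dly = np.diff(ly)  # frame-to-frame change in the Y coordinate
        scores = [spearmanr(dly, np.diff(ry)).correlation
                  for ry in right_tracks]
        pairs[i] = int(np.argmax(scores))
    return pairs

# Toy Y sequences: left track 0 corresponds to right track 1, and vice versa
t = np.linspace(0, 4 * np.pi, 50)
left = [np.sin(t), np.cos(2 * t)]
right = [np.cos(2 * t) + 3.0, 1.5 * np.sin(t) + 7.0]
print(match_tracks(left, right))  # {0: 1, 1: 0}
```

Working on Y *changes* rather than raw Y values makes the match insensitive to the constant offsets between the two camera views.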
2D to 3D Conversion
• P = K · [R | T], the camera projection matrix (3×4) for each camera
– K = camera matrix (3×3)
– R = rotation matrix (3×3)
– T = translation vector (3×1)
• Results are a good match with actual 3D
coordinates
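The conversion can be sketched with linear (DLT) triangulation from the two projection matrices; the identity-intrinsics cameras below are illustrative assumptions, not the study's calibration:

```python
import numpy as np

def triangulate(P_left, P_right, xl, xr):
    """Linear (DLT) triangulation: recover the 3D point from its 2D
    projections xl, xr given the two 3x4 camera projection matrices."""
    A = np.vstack([
        xl[0] * P_left[2] - P_left[0],
        xl[1] * P_left[2] - P_left[1],
        xr[0] * P_right[2] - P_right[0],
        xr[1] * P_right[2] - P_right[1],
    ])
    _, _, Vt = np.linalg.svd(A)     # null vector of A is the homogeneous point
    X = Vt[-1]
    return X[:3] / X[3]             # de-homogenize to (x, y, z)

# Illustrative rig: identity intrinsics, right camera translated along x
K = np.eye(3)
P_left = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P_right = K @ np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])

X_true = np.array([0.5, 0.2, 4.0])
xl = P_left @ np.append(X_true, 1); xl = xl[:2] / xl[2]
xr = P_right @ np.append(X_true, 1); xr = xr[:2] / xr[2]
X_est = triangulate(P_left, P_right, xl, xr)  # recovers (0.5, 0.2, 4.0)
```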
Evaluation 1: % IoU Difference
Step 1: Calculate a perfect overlap. For each true stream, sum the
area of the boxes over every frame:
  area of stream = (box area) × (number of frames)
Step 2: Calculate the percent difference between this perfect area and
the actual IoU between the predicted and true streams.
(Figure: % IoU Difference formula and stream diagrams.)
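A sketch of the metric under stated assumptions: the slide's formula was shown as a figure, so this version takes the percent difference as (perfect − actual) / perfect and sums per-frame intersection areas so the units match the summed box areas:

```python
def box_area(box):
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)

def intersection_area(a, b):
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)

def pct_iou_difference(true_stream, pred_stream):
    """Perfect overlap = summed true-box area over all frames; the metric
    is how far the actual per-frame overlap falls short of it (0% = perfect)."""
    perfect = sum(box_area(t) for t in true_stream)
    actual = sum(intersection_area(t, p)
                 for t, p in zip(true_stream, pred_stream))
    return 100.0 * (perfect - actual) / perfect

true_s = [(0, 0, 10, 10)] * 4
pred_s = [(0, 0, 10, 10)] * 2 + [(5, 0, 15, 10)] * 2  # half-overlap in two frames
print(pct_iou_difference(true_s, pred_s))  # 25.0
```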
Evaluation 2: % Correctly Labeled
Step 1: Determine the best-matching marker label for each frame in the
predicted streams, using the maximum IoU in each frame.
Step 2: Calculate the percent of frames labeled with the same label as
the overall label given in the tracking phase.
(Example from the slide: a predicted stream with per-frame labels
0, 0, 2, 2, 2, 1, 2, 2, 2 is labeled Marker 2.)
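The labeling metric can be sketched as follows; taking the stream's overall label to be the per-frame majority is an assumption (the slides say the overall label comes from the tracking phase):

```python
from collections import Counter

def pct_correctly_labeled(per_frame_labels):
    """Given the best-matching true marker label per frame (Step 1),
    take the stream's overall label as the per-frame majority and
    return the share of frames agreeing with it (Step 2)."""
    overall = Counter(per_frame_labels).most_common(1)[0][0]
    correct = sum(1 for lbl in per_frame_labels if lbl == overall)
    return overall, 100.0 * correct / len(per_frame_labels)

# Frame labels from the slide's example: the stream is labeled Marker 2
labels = [0, 0, 2, 2, 2, 1, 2, 2, 2]
overall, pct = pct_correctly_labeled(labels)
print(overall, round(pct, 1))  # 2 66.7
```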
Results
Challenges
(Illustrations: ideal scenario, off-screen markers, occluded markers.)
Even if the marker detection and tracking models perform well, off-screen
and occluded markers may negatively impact results, since at some point a
marker may be assigned to the incorrect track
Next Steps
• Detection
– Tune marker detection thresholds, and marker assignment thresholds
• Kalman Filter
– Tune initialization velocities, and acceleration and covariance matrices
– Better initialization is known to produce better predictions
– Non-linear methods (Extended Kalman Filters, particle filters)
• Marker Detection Assignment to Kalman Tracks
– Currently using Hungarian assignment; other options include probabilistic assignment and
Markov chain Monte Carlo methods
• Stitching
– Tune parameters and the algorithm to better stitch together disparate tracks
References
[1] Bunyak F, Shiraishi N, Palaniappan K, Lever TE, Avivi-Arber L, Takahashi K. Development of semi-automatic
procedure for detection and tracking of fiducial markers for orofacial kinematics during natural feeding.
Conf Proc IEEE Eng Med Biol Soc. 2017;2017:580-583. doi:10.1109/EMBC.2017.8036891
[2] Best MD, Nakamura Y, Kijak NA, et al. Semiautomatic marker tracking of tongue positions captured by
videofluoroscopy during primate feeding. Conf Proc IEEE Eng Med Biol Soc. 2015;2015:5347-5350.
doi:10.1109/EMBC.2015.7319599
[3] Howe PDL, Holcombe AO. The effect of visual distinctiveness on multiple object tracking performance.
Front Psychol. 2012;3:307. doi:10.3389/fpsyg.2012.00307
Thank You!
Funding Information:
• National Center for Advancing Translational Sciences of the National Institutes of Health (UL1 TR000430)
• JSPS The Strategic Young Researcher Overseas Visits Program for Accelerating Brain Circulation (S2504)
• JSPS KAKENHI (JP16K11589)
Acknowledgements:
• Dr. Naru Shiraishi, Niigata University, Japan (for experimental procedure development and data collection)
• Animal Research Center (ARC) staff at the University of Chicago
Saurabh Ghanekar
Principal Consultant
Kavi Global
saurabh@kaviglobal.com
Kazutaka Takahashi, Ph.D.
Research Assistant Professor
Department of Organismal Biology and Anatomy
University of Chicago
kazutaka@uchicago.edu
