Arif Akar Seval Çapraz
Presenters
Optical Flow with Semantic
Segmentation and Localized Layers
Laura Sevilla-Lara, Deqing Sun, Varun Jampani, Michael J. Black
Max Planck Institute for Intelligent Systems, Harvard University, NVIDIA Corporation
Outline
● Introduction
○ Problem Statement, Key Assumptions
● Semantic Optical Flow
○ Semantic Segmentation, Localized Layers, Model and Methods
● Experiments
○ Natural YouTube Videos, KITTI 2015
● Conclusion
● Project Stage
Q: What is the aim of Optical Flow Research?
● We are interested in finding the movement of scene objects from time-varying images
(videos).
● Lots of uses
○ Track object behavior
○ Correct for camera jitter (stabilization)
○ Align images (mosaics)
○ 3D shape reconstruction
○ Human Action Recognition
Slide Credit: S. Narasimhan
Problem Statement – Optical Flow
● How to estimate pixel motion from image H to image I?
• Find pixel correspondences
• Given a pixel in H, look for nearby pixels of the same color in I
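The correspondence search described above can be sketched as a brute-force patch match. This is only an illustration of the idea, not the method used in the paper: the function names, the patch radius `r`, the search `radius`, and the sum-of-squared-differences cost are all assumptions made for this example.

```python
def patch_ssd(img_h, img_i, y, x, dy, dx, r):
    """Sum of squared differences between a patch in H and a shifted patch in I."""
    total = 0.0
    for j in range(-r, r + 1):
        for i in range(-r, r + 1):
            diff = img_h[y + j][x + i] - img_i[y + dy + j][x + dx + i]
            total += diff * diff
    return total


def match_pixel(img_h, img_i, y, x, radius=2, r=1):
    """Return the displacement (dy, dx) within `radius` that best matches the patch around (y, x)."""
    height, width = len(img_h), len(img_h[0])
    best, best_cost = (0, 0), float("inf")
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Skip candidates whose patch would fall outside image I.
            if not (r <= y + dy < height - r and r <= x + dx < width - r):
                continue
            cost = patch_ssd(img_h, img_i, y, x, dy, dx, r)
            if cost < best_cost:
                best, best_cost = (dy, dx), cost
    return best


# Toy example: a bright 2x2 blob moves one pixel down and one pixel right.
H = [[0] * 8 for _ in range(8)]
I = [[0] * 8 for _ in range(8)]
for yy, xx in [(3, 3), (3, 4), (4, 3), (4, 4)]:
    H[yy][xx] = 255
    I[yy + 1][xx + 1] = 255

flow_at_blob = match_pixel(H, I, 3, 3)
print(flow_at_blob)
```

Comparing a small patch rather than a single pixel makes the match less ambiguous, since many pixels in I may share the same color.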
Key Assumptions
○ Color Constancy: A point in H looks “the same” in image I
■ For grayscale images, this is brightness constancy
○ Small Motion: Points do not move very far
■ When motion is large, estimate at reduced resolution first, where
displacements are smaller -> Use Pyramid Representation!
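A minimal sketch of why a pyramid helps with the small-motion assumption: each level halves the resolution, so a displacement shrinks by a factor of two per level. The 2x2 block averaging below is a simplification chosen for brevity (practical pyramids usually blur before subsampling), and the function names are hypothetical.

```python
def downsample(img):
    """Halve the resolution by averaging each 2x2 block of pixels."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [
        [
            (img[2 * y][2 * x] + img[2 * y][2 * x + 1]
             + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) / 4.0
            for x in range(w)
        ]
        for y in range(h)
    ]


def build_pyramid(img, levels):
    """Return [full resolution, 1/2, 1/4, ...] with `levels` entries."""
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid


# A 16x16 horizontal ramp image as toy input.
image = [[float(x) for x in range(16)] for _ in range(16)]
pyramid = build_pyramid(image, levels=4)
sizes = [(len(level), len(level[0])) for level in pyramid]

# A displacement of 8 pixels at full resolution shrinks by 2 per level.
displacement_per_level = [8 / (2 ** k) for k in range(len(pyramid))]
print(sizes, displacement_per_level)
```

Flow is typically estimated at the coarsest level first and then refined level by level, so that the residual motion at each level stays small.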
Semantic Optical Flow
What can be improved in existing Optical Flow approaches?
○ Generic, spatially homogeneous assumptions about the structure of the flow
○ Different objects move differently
○ Handling complex scene motion
○ Handling discontinuities at object boundaries
1. Use Semantic Segmentation
○ Provide information on object boundaries
○ Object class determines the type of motion
■ Things, Planes and Stuff
○ Provide information on relative local depth orderings
Semantic Optical Flow
2. Localized Layer Models
○ To handle complex scene motions and motion boundaries
○ Not globally modeled, localized layers
○ Better foreground-background representation
Semantic Optical Flow
[Figure panels: a. Initial Segmentation, b. Resulting Segmentation, c. Discrete Flow, d. Resulting Semantic Flow]
Semantic Optical Flow
Proposed Method
1. Segmentation with DeepLab [1] and object class matching
2. Compute Initial flow field with Discrete Flow [2].
3. Initialization and Optimization
4. Composition of the Flow Field
[1] L. Chen et al. Semantic image segmentation with deep convolutional nets and fully connected CRFs. CoRR, abs/1412.7062, 2014.
[2] M. Menze et al. Discrete optimization for optical flow. In German Conference on Pattern Recognition (GCPR), volume 9358, pages 16–28. Springer International Publishing, 2015.
Model and Methods
● Three classes of Objects:
1. Things
○ Defined spatial extent, rigid or non-rigid, move independently, typically foreground
2. Planes
○ Broad spatial extent, roughly planar, typically background
3. Stuff
○ Buildings, vegetation; unknown classes are also assigned here
Model and Methods
● Three classes of Objects:
1. Motion of Things
○ Modeled as affine transformation + smooth deformation from affine
2. Motion of Planes
○ Modeled as homographies; RANSAC is used to estimate the homography parameters h_i
3. Motion of Stuff
○ No specific motion model, set each region to initial flow
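To make the first two motion models concrete, the sketch below computes the flow vector at a pixel under an affine model and under a homography. The six-parameter affine layout and the row-major 3x3 homography are assumptions made for this illustration; the slides do not give the paper's parameterization, and the smooth deformation term and the RANSAC fitting step are left out.

```python
def affine_flow(x, y, params):
    """Flow (u, v) at pixel (x, y) under a 6-parameter affine motion model."""
    a1, a2, a3, a4, a5, a6 = params
    u = a1 + a2 * x + a3 * y
    v = a4 + a5 * x + a6 * y
    return u, v


def homography_flow(x, y, h):
    """Flow (u, v) at pixel (x, y) induced by a 3x3 homography `h` (row-major nested lists)."""
    xw = h[0][0] * x + h[0][1] * y + h[0][2]
    yw = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    # Divide by w to return from homogeneous to pixel coordinates.
    return xw / w - x, yw / w - y


# Pure translation expressed as an affine model.
translation = affine_flow(10.0, 5.0, (2.0, 0.0, 0.0, -1.0, 0.0, 0.0))

# The identity homography produces no motion; adding a shift of 3 in x moves every pixel by 3.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
no_motion = homography_flow(10.0, 5.0, identity)
shift = [[1.0, 0.0, 3.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
shifted = homography_flow(10.0, 5.0, shift)
print(translation, no_motion, shifted)
```

An affine model has six degrees of freedom and a homography eight, which is why the homography can describe the perspective motion of a large plane such as a road.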
Models and Methods
● Data Term: imposes appearance constancy when corresponding pixels are visible in the same layer
Models and Methods
● Motion Term: encodes two assumptions: 1. neighboring pixels should move together if they
belong to the same layer k; 2. pixels in each layer should share a global motion model
whose parameters change over time and depend on the object class.
Models and Methods
● Time Term: encourages corresponding pixels over time to have the same layer label.
Models and Methods
● Layer Term: a coupling term that enforces similarity between the layer segmentation and
the semantic segmentation
Models and Methods
● Space Term: encourages spatial contiguity of layer segmentation.
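The slides list the five terms without the objective that combines them. One plausible reading is a weighted sum over frames and layers; the summation indices t (frame) and k (layer) and the weights \lambda are assumptions of this sketch and are not taken from the slides.

```latex
E \;=\; \sum_{t}\sum_{k}\Big(
      E_{\text{data}}
    + \lambda_{\text{motion}}\,E_{\text{motion}}
    + \lambda_{\text{time}}\,E_{\text{time}}
    + \lambda_{\text{layer}}\,E_{\text{layer}}
    + \lambda_{\text{space}}\,E_{\text{space}}
\Big)
```

Under this reading, the flow and the layer segmentation are found by minimizing E, with each weight trading off its term against the data term.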
Models and Methods
Experiments
1. Natural YouTube Videos (containing objects of PASCAL VOC classes)
○ No ground truth for quantitative analysis
○ Only qualitative results provided
2. KITTI 2015
Overall percentage of outliers compared
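The outlier percentage can be sketched as follows, assuming the KITTI 2015 convention that a pixel counts as an outlier when its end-point error exceeds both 3 pixels and 5% of the ground-truth flow magnitude. The function name and the use of `None` to mark pixels without ground truth are choices made for this example only.

```python
import math


def outlier_percentage(flow_est, flow_gt, abs_thresh=3.0, rel_thresh=0.05):
    """Percentage of valid pixels whose end-point error exceeds both thresholds.

    flow_est and flow_gt are lists of (u, v) tuples; a ground-truth entry of
    None marks a pixel without valid ground truth and is skipped.
    """
    outliers = valid = 0
    for est, gt in zip(flow_est, flow_gt):
        if gt is None:
            continue
        valid += 1
        epe = math.hypot(est[0] - gt[0], est[1] - gt[1])
        magnitude = math.hypot(gt[0], gt[1])
        if epe > abs_thresh and epe > rel_thresh * magnitude:
            outliers += 1
    return 100.0 * outliers / valid if valid else 0.0


# Four pixels: one true outlier, one large but relatively small error,
# one small error, and one pixel without ground truth.
gt = [(10.0, 0.0), (100.0, 0.0), (0.0, 0.0), None]
est = [(14.0, 0.0), (104.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
pct = outlier_percentage(est, gt)
print(round(pct, 2))
```

The relative threshold keeps fast-moving pixels from being penalized for errors that are small compared with their true motion.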
Experiments
Experiments
2. KITTI 2015 – Online Competition
Top results among all monocular methods.
http://www.cvlibs.net/datasets/kitti/eval_scene_flow.php?benchmark=flow
Conclusion
● Using semantic segmentation improves optical flow estimation
● Different motion models are defined per object class, with the focus on the motion of things
● A key insight is that a detected object region is likely to contain at
most two motions and the object is likely to be in front
Project Stage
● The code of the paper has been released (Matlab), with a DEMO script provided
● We modified the code so that it runs on the whole KITTI 2015 data set
(~4.5 hours)
● We are experimenting with the code to see how the motion models can be improved
● …..
● …..
● Ultimate Goal: Segmentation can help optical flow (OF); can OF help segmentation too?
The two processes could be integrated so that each improves the other.