1
Fast Multi-frame Stereo Scene Flow
with Motion Segmentation
Tatsunori Taniai*
RIKEN AIP
Sudipta N. Sinha
Microsoft Research
Yoichi Sato
The University of Tokyo
CVPR 2017 Paper
* Work done during internship at Microsoft Research and partly at the University of Tokyo.
3
Contributions
• New unified framework
– Stereo (depth / disparity)
– Optical flow (2D motion field)
– Motion segmentation (binary mask of moving objects)
– Visual odometry (6 DoF camera ego-motion)
In our framework
• The result of each task benefits the others, leading to higher accuracy and efficiency
• The joint task is decomposed into simple optimization problems (in contrast to existing joint methods)
Results
• Accurate: ranked 3rd on the KITTI benchmark
• Fast: 10-1000x faster than state-of-the-art methods
5
Scene Flow: Problem Definition
[Figure: a 3D point 𝑿_t = (x_t, y_t, z_t) projects to pixel p in the left image 𝐼_t^0 and to p′ in the right image 𝐼_t^1]
Stereo disparity: a 1D horizontal translation determined by the object depth z.
6
Scene Flow: Problem Definition
[Figure: images 𝐼_t^0, 𝐼_t^1, 𝐼_{t+1}^0, 𝐼_{t+1}^1; the point 𝑿_t moves to 𝑿_{t+1}, and its projection moves from p to p′]
Optical flow: a 2D translation caused by camera and object motions.
7
Scene Flow: Problem Definition
[Figure: point 𝑿_t at time t and 𝑿_{t+1} at time t+1, observed in images 𝐼_t^0 and 𝐼_{t+1}^0]
Stereo disparity: a 1D horizontal translation by the object depth z.
Optical flow: a 2D translation by camera and object motions.
Together they implicitly represent the 3D motion of scene points.
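The statement above can be made concrete for a calibrated, rectified stereo rig: disparity gives depth, depth plus flow gives corresponding 3D points at t and t+1, and their difference is the 3D scene flow. The focal length f, baseline B, and principal point c below are assumed calibration values, not from the slides.

```python
import numpy as np

def scene_flow_from_disparity_and_flow(p, d_t, d_t1, flow, f, B, c):
    """Recover the 3D motion of one pixel from disparity and optical flow.

    p:    pixel (x, y) in the left image at time t
    d_t:  disparity of p at time t
    d_t1: disparity of the corresponding pixel at time t+1
    flow: optical flow (u, v) of p between t and t+1
    f:    focal length in pixels, B: stereo baseline, c: principal point (cx, cy)
    """
    def backproject(px, d):
        z = f * B / d                      # depth from disparity
        x = (px[0] - c[0]) * z / f
        y = (px[1] - c[1]) * z / f
        return np.array([x, y, z])

    X_t = backproject(p, d_t)              # 3D point at time t
    p_next = (p[0] + flow[0], p[1] + flow[1])
    X_t1 = backproject(p_next, d_t1)       # 3D point at time t+1
    return X_t1 - X_t                      # 3D scene flow vector
```

For example, a pixel at the principal point whose disparity halves between frames has moved purely away from the camera.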
8
Applications
Autonomous driving
[Menze+ CVPR 15]
Action recognition
[Wang+ CVPR 11]
Depth and flow map sequences are useful in many applications
But optical flow estimation is VERY SLOW.
9
Overview
• Introduction
• Motivation
• Proposed method
• Experiments
10
Optical Flow vs Stereo
                Optical flow                      Stereo matching
Search space    2D translation                    1D translation
Motion factor   Object motion, ego-motion, etc.   Object depth
Optical flow is much more difficult & expensive than stereo
11
Dominant Rigid Scene Assumption
Most points in the scene are static; their flow is caused solely by camera motion.
12
Flow Estimation by Depth and Camera Motion
[Figure: rigid flow map, ground-truth flow map, and error map of the rigid flow, computed from images 𝐼_t, 𝐼_{t+1} and surfaces D_t, D_{t+1}]
Given the rigid flow map, we only need to recompute flow for moving objects.
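The rigid flow map itself follows from the depth map and the camera motion: back-project each pixel with its depth, apply the camera motion, and re-project. A minimal sketch, assuming a pinhole intrinsics matrix K and a motion (R, t); in the paper this flow comes from the estimated disparity D and ego-motion 𝐏.

```python
import numpy as np

def rigid_flow(depth, K, R, t):
    """Flow induced purely by camera motion (R, t) for a static scene.

    depth: HxW depth map at time t
    K: 3x3 camera intrinsics; (R, t): camera motion from frame t to t+1.
    """
    H, W = depth.shape
    ys, xs = np.mgrid[0:H, 0:W].astype(np.float64)
    ones = np.ones_like(xs)
    pix = np.stack([xs, ys, ones], axis=0).reshape(3, -1)   # homogeneous pixels
    rays = np.linalg.inv(K) @ pix                           # back-project to rays
    X = rays * depth.reshape(1, -1)                         # 3D points at time t
    X1 = R @ X + t.reshape(3, 1)                            # move into frame t+1
    p1 = K @ X1
    p1 = p1[:2] / p1[2:3]                                   # perspective projection
    flow = p1 - pix[:2]                                     # 2D displacement
    return flow.reshape(2, H, W)
```

With an identity motion the flow is zero everywhere, as expected for a static camera.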
13
Overview
• Introduction
• Motivation
• Proposed method
• Experiments
14
Proposed Approach
Pipeline (input → output per stage):
1. Binocular stereo:            𝐼_t^0, 𝐼_t^1                        → D (init. disparity)
2. Visual odometry:             𝐼_t^0, 𝐼_{t+1}^0 + D                → 𝐏 (ego-motion)
3. Epipolar stereo:             𝐼_{t±1}^{0,1}, 𝐼_t^0, 𝐼_t^1 + D, 𝐏  → D (disparity)
4. Initial motion segmentation: 𝐼_{t±1}^{0,1}, 𝐼_t^0, 𝐼_t^1 + D, 𝐏  → F_rig (rigid flow), S (init. seg.)
5. Optical flow:                𝐼_t^0, 𝐼_{t+1}^0 + S                → F_non (non-rigid flow)
6. Flow fusion:                 𝐼_t^0, 𝐼_{t+1}^0 + F_rig, F_non     → F (flow), S (motion seg.)
15
Optimization Strategy
𝐸(𝚯) = Σ_p ‖ 𝐼_t^0(p) − 𝐼_{t+1}^0( w(p; 𝚯) ) ‖
Minimize image residuals.
16
Optimization Strategy
𝐸(D, 𝐏, S, F_non) = Σ_p ‖ 𝐼_t^0(p) − 𝐼_{t+1}^0( w(p; D, 𝐏, S, F_non) ) ‖
Minimize image residuals by gradually increasing the complexity of the warping model:
w(p) → w(p; D, 𝐏) → w(p; D, 𝐏, S, F_non)
i.e., from rigid warping (binocular stereo, visual odometry, epipolar stereo) to partially non-rigid warping (initial motion segmentation, optical flow, flow fusion).
17
Intermediate Step: Binocular Stereo
[Pipeline diagram with the Binocular stereo stage highlighted; output: D (init. disparity)]
18
Intermediate Step: Binocular Stereo
[Pipeline diagram with the Binocular stereo stage highlighted]
[Figure: initial disparity map, left-right occlusion map, uncertainty map]
• SGM stereo with NCC-based matching costs
• Left-right consistency check → occlusion map
• Uncertainty map using [Drory+ 2014] (no computational overhead)
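The NCC-based matching cost can be sketched as a zero-mean normalized cross-correlation over a patch; the window shape and the mapping from correlation score to cost are assumptions here, not the paper's exact setup.

```python
import numpy as np

def ncc(patch_a, patch_b, eps=1e-6):
    """Zero-mean normalized cross-correlation between two image patches.

    Returns a score in [-1, 1]; a matching cost can then be taken as
    (1 - NCC) / 2. Being mean- and gain-normalized, NCC is robust to
    brightness and contrast changes between the two views.
    """
    a = patch_a - patch_a.mean()
    b = patch_b - patch_b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum()) + eps
    return float((a * b).sum() / denom)
```

Identical patches score near 1 even after a gain/bias change, which is why NCC is a common choice for stereo matching under exposure differences.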
19
Intermediate Step: Visual Odometry
[Pipeline diagram with the Visual odometry stage highlighted; output: 𝐏 (ego-motion)]
20
Intermediate Step: Visual Odometry
[Pipeline diagram with the Visual odometry stage highlighted]
min_𝐏 𝐸(𝐏 | D) = Σ_p w(p) ρ( 𝐼_t(p) − 𝐼_{t+1}( w(p; D, 𝐏) ) ),  ρ: robust penalty function
Using [Alismail+ CMU-TR14]:
• Estimate the 6-DoF camera motion by directly minimizing image residuals under rigid warping
• Iteratively reweighted least squares (Lucas-Kanade + inverse compositional)
• Down-weight moving-object regions predicted by the previous flow F_{t−1} and mask S_{t−1}
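One reweighting step of the IRLS scheme can be sketched as below. The slide only states that a robust penalty ρ is used, so the Huber weight function and the threshold k are illustrative assumptions, not the paper's exact choice.

```python
import numpy as np

def irls_weights(residuals, k=1.345):
    """One reweighting step of iteratively reweighted least squares.

    Huber weights: w(r) = 1 for |r| <= k, and k/|r| otherwise, so large
    residuals (e.g. on moving objects) contribute less to the next
    least-squares pose update.
    """
    r = np.abs(residuals)
    w = np.ones_like(r, dtype=float)
    large = r > k
    w[large] = k / r[large]
    return w
```

In the full solver these weights would multiply the linearized photometric residuals each Lucas-Kanade iteration, alongside the moving-object down-weighting from F_{t−1} and S_{t−1}.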
21
Intermediate Step: Epipolar Stereo
[Pipeline diagram with the Epipolar stereo stage highlighted; output: D (disparity)]
22
Intermediate Step: Epipolar Stereo
[Pipeline diagram with the Epipolar stereo stage highlighted]
Left-right matching between 𝐼_t^0 and 𝐼_t^1 is unreliable under occlusion.
• Blend matching costs with the four adjacent frames 𝐼_{t±1}^{0,1} (using the estimated poses 𝐏_t, 𝐏_{t−1})
• High uncertainty → high weights on adjacent-frame matching
[Figure: occlusion map and uncertainty map]
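The uncertainty-driven blending can be sketched as a per-pixel convex combination of cost volumes; the linear rule below is an assumption for illustration, not the paper's exact weighting scheme.

```python
import numpy as np

def blend_costs(cost_lr, costs_adj, uncertainty):
    """Blend left-right matching costs with adjacent-frame costs.

    cost_lr:     HxWxD cost volume from matching I_t^0 against I_t^1
    costs_adj:   list of HxWxD cost volumes against adjacent frames
    uncertainty: HxW map in [0, 1]; high uncertainty shifts weight from the
                 left-right costs to the adjacent-frame costs.
    """
    adj = np.mean(costs_adj, axis=0)         # average the adjacent-frame volumes
    u = uncertainty[..., None]               # broadcast over the disparity axis
    return (1.0 - u) * cost_lr + u * adj
```

Pixels flagged as occluded or uncertain thus rely mostly on temporally adjacent views, where the occluder may not block the match.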
23
Intermediate Step: Epipolar Stereo
[Pipeline diagram with the Epipolar stereo stage highlighted]
[Figure: initial disparity map, final disparity map, ground truth]
• Run SGM stereo again using the blended matching costs
• Disparities are improved in occluded regions
24
Intermediate Step: Initial Motion Segmentation
[Pipeline diagram with the Initial motion segmentation stage highlighted; outputs: F_rig (rigid flow), S (initial segmentation)]
25
Intermediate Step: Initial Motion Segmentation
[Pipeline diagram with the Initial motion segmentation stage highlighted]
• Predict moving-object regions where the rigid flow proposal is inaccurate
[Figure: rigid flow proposal, initial segmentation, ground truth]
26
Intermediate Step: Initial Motion Segmentation
[Pipeline diagram with the Initial motion segmentation stage highlighted]
• Predict moving-object regions where the rigid flow proposal is inaccurate
• Use image residuals as soft seeds in GrabCut-based segmentation
[Figure: rigid flow proposal, image residual, initial segmentation]
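Turning residuals into soft seeds might look like the sketch below; the thresholds and the ternary seed encoding are hypothetical. The paper feeds such soft seeds into a GrabCut-style segmentation with spatial regularization rather than thresholding directly.

```python
import numpy as np

def soft_seeds(residual, lo=0.05, hi=0.2):
    """Turn per-pixel image residuals into soft segmentation seeds.

    Pixels with residual above `hi` are likely moving (foreground seeds),
    those below `lo` are likely static (background seeds), and the rest
    remain unknown for the segmentation to decide.
    """
    seeds = np.zeros(residual.shape, dtype=np.int8)  # 0 = unknown
    seeds[residual > hi] = 1                         # probable foreground
    seeds[residual < lo] = -1                        # probable background
    return seeds
```

The seeds being "soft" matters: a noisy residual pixel only biases, rather than fixes, the final label.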
27
Intermediate Step: Optical Flow
[Pipeline diagram with the Optical flow stage highlighted; output: F_non (non-rigid flow)]
28
Intermediate Step: Optical Flow
[Pipeline diagram with the Optical flow stage highlighted]
[Figure: initial segmentation, non-rigid flow proposal]
• Estimate non-rigid flow for the predicted moving-object regions
• Extend the SGM algorithm to optical flow
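The SGM recursion being extended can be sketched in 1D: costs are aggregated along a scanline with small penalties P1 for label changes of ±1 and a larger penalty P2 for bigger jumps. This is a minimal sketch of the classic recursion; the paper's extension from 1D disparities to 2D flow labels is not shown here.

```python
import numpy as np

def sgm_aggregate_1d(cost, P1=1.0, P2=8.0):
    """Aggregate matching costs along one scanline, SGM-style.

    cost: N x D array (N pixels along the scanline, D candidate labels).
    Recursion per pixel p and label d:
      L(p, d) = C(p, d) + min( L(p-1, d),
                               L(p-1, d-1) + P1, L(p-1, d+1) + P1,
                               min_k L(p-1, k) + P2 ) - min_k L(p-1, k)
    """
    N, D = cost.shape
    L = np.empty((N, D), dtype=float)
    L[0] = cost[0]
    for p in range(1, N):
        prev = L[p - 1]
        m = prev.min()
        best = prev.copy()                                # keep the same label
        best[1:] = np.minimum(best[1:], prev[:-1] + P1)   # come from label d-1
        best[:-1] = np.minimum(best[:-1], prev[1:] + P1)  # come from label d+1
        L[p] = cost[p] + np.minimum(best, m + P2) - m     # large jumps pay P2
    return L
```

In full SGM this aggregation runs along several directions and the results are summed before a winner-take-all label selection.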
29
Intermediate Step: Flow Fusion
[Pipeline diagram with the Flow fusion stage highlighted; outputs: F (flow), S (motion segmentation)]
30
Intermediate Step: Flow Fusion
[Pipeline diagram with the Flow fusion stage highlighted]
[Figure: rigid flow proposal + non-rigid flow proposal → final flow map and final motion segmentation]
• Fusion is cast as a binary labeling (white: non-rigid, black: rigid)
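A greedy per-pixel version of this fusion can be sketched as below: pick, at each pixel, the proposal with the lower warping residual. This simplification drops the spatial smoothness of the actual binary-labeling optimization, but shows how the fused flow F and the motion segmentation S fall out of the same decision.

```python
import numpy as np

def fuse_flows(rigid_flow, nonrigid_flow, res_rigid, res_nonrigid):
    """Fuse rigid and non-rigid flow proposals by per-pixel binary labeling.

    rigid_flow, nonrigid_flow: HxWx2 flow proposals
    res_rigid, res_nonrigid:   HxW warping residuals of each proposal
    Returns the fused flow F and the binary motion segmentation S
    (True = non-rigid, i.e. a moving-object pixel).
    """
    use_nonrigid = res_nonrigid < res_rigid          # pixel-wise label choice
    mask = use_nonrigid[..., None]                   # broadcast over (u, v)
    fused = np.where(mask, nonrigid_flow, rigid_flow)
    return fused, use_nonrigid
```

The segmentation mask is thus a by-product of fusion, which is how the final motion segmentation on the slide arises.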
31
Overview
• Introduction
• Motivation
• Proposed method
• Experiments
32
KITTI 2015 Scene Flow Benchmark
Our method is ranked 3rd (November 2016)
200 road scenes with multiple moving objects
35
Summary of This Research
• New unified framework
– Stereo (depth / disparity)
– Optical flow (2D motion field)
– Motion segmentation (binary mask of moving objects)
– Visual odometry (6 DoF camera ego-motion)
• Accurate: ranked 3rd on the KITTI benchmark
• Fast: 10-1000x faster than state-of-the-art methods
36
Take-home message: RIKEN AIP is a wonderful place for young researchers and students.
Contact me about internship opportunities.
[Photos: Nihonbashi office; 24 DGX-1 machines]