DTAM: Dense Tracking and Mapping in Real-Time
Newcombe, Lovegrove & Davison, ICCV 2011
Outline
➢ Introduction
➢ Related Work
➢ System Overview
➢ Dense Mapping
➢ Dense Tracking
➢ Evaluation and Results
➢ Conclusions and Future Work
Introduction
● Dense Tracking and Mapping in Real-Time
● DTAM is a system for real-time camera tracking and reconstruction that relies not on feature extraction but on dense, every-pixel methods.
● Simultaneous frame-rate tracking and dense mapping
Related Work
● Real-time SFM (Structure from Motion)
● PTAM (G. Klein and D. W. Murray, ISMAR 2007)
● Improving the agility of keyframe-based SLAM (G. Klein and D. W. Murray, ECCV 2008)
● Live dense reconstruction with a single moving camera (R. A. Newcombe and A. J. Davison, CVPR 2010)
● Real-time dense geometry from a handheld camera (J. Stuehmer et al., 2010)
System Overview
● Input:
- Single hand-held RGB camera
● Objective:
- Dense mapping
- Dense tracking
[Figure: input image and the resulting 3D dense map]
Dense Mapping
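● Estimate an inverse depth map from a bundle of images
Photometric error
● Total cost: $C_r(u, d) = \frac{1}{|I(r)|} \sum_{m \in I(r)} \| \rho_r(I_m, u, d) \|_1$
● Photometric error: $\rho_r(I_m, u, d) = I_r(u) - I_m(\pi(K T_{mr} \pi^{-1}(u, d)))$
● Where:
- $K$ ... intrinsic matrix
- $T_{mr}$ ... rigid-body transformation taking points from the reference frame r into frame m
- $\pi$ ... perspective projection
- $\pi^{-1}(u, d)$ ... back-projection of pixel u at inverse depth d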
Depth map estimation
● Principle:
- S depth hypotheses are considered for each pixel of the reference image $I_r$
- Each corresponding 3D point is projected onto a bundle of images $I_m$
- Keep the depth hypothesis that best respects the color consistency between the reference and the bundle of images
● Formulation:
- $(u, d)$: pixel position and depth hypothesis
- $|I(r)|$: number of valid reprojections of the pixel in the bundle
- $\rho_r(I_m, u, d)$: photometric error between the reference and the current image
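The principle above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: a pinhole camera with intrinsics `(fx, fy, cx, cy)`, grey-level images stored as nested lists, nearest-neighbour pixel lookup, and hypothetical helper names (`backproject`, `transform`, `project`, `photometric_cost`) that do not come from the slides.

```python
# Hypothetical helpers; images are lists of rows, poses are (R, t) pairs.

def backproject(u, v, inv_depth, fx, fy, cx, cy):
    """Pixel (u, v) plus inverse depth -> 3D point in the reference frame."""
    z = 1.0 / inv_depth
    return ((u - cx) / fx * z, (v - cy) / fy * z, z)


def transform(R, t, p):
    """Apply the rigid-body motion (R, t) to the 3D point p."""
    return tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))


def project(p, fx, fy, cx, cy):
    """3D point -> pixel coordinates, or None if it lies behind the camera."""
    x, y, z = p
    if z <= 0:
        return None
    return (fx * x / z + cx, fy * y / z + cy)


def photometric_cost(u, v, inv_depth, I_r, bundle, intr):
    """Average absolute intensity difference over the valid reprojections."""
    total, valid = 0.0, 0
    for I_m, R, t in bundle:
        point = transform(R, t, backproject(u, v, inv_depth, *intr))
        px = project(point, *intr)
        if px is None:
            continue
        x, y = int(round(px[0])), int(round(px[1]))  # nearest-neighbour lookup
        if 0 <= y < len(I_m) and 0 <= x < len(I_m[0]):
            total += abs(I_r[v][u] - I_m[y][x])
            valid += 1
    return total / valid if valid else float("inf")
```

Evaluating `photometric_cost` for each of the S hypotheses of a pixel fills one column of the cost volume; a real-time system would run this on the GPU with interpolated lookups rather than in nested Python loops.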
Depth map estimation
[Figure: an example reference image pixel, the reprojection of its depth hypotheses onto one image of the bundle, and the photometric error plotted against the depth hypotheses for the image bundle]
Inverse Depth Map Computation
● The inverse depth map can be computed by minimizing the photometric error (exhaustive search over the volume): $\xi(u) = \arg\min_d C(u, d)$
● But featureless regions are prone to false minima
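A minimal sketch of this exhaustive search, assuming the cost volume is stored as a nested list `cost_volume[y][x][k]` with one cost per inverse depth sample; the function name `winner_take_all` is a hypothetical label, not taken from the slides.

```python
def winner_take_all(cost_volume, inv_depths):
    """Per pixel, pick the inverse depth sample with the smallest cost."""
    depth_map = []
    for row in cost_volume:
        out_row = []
        for costs in row:
            best = min(range(len(costs)), key=costs.__getitem__)
            out_row.append(inv_depths[best])
        depth_map.append(out_row)
    return depth_map
```

For a pixel whose costs are all equal, as in a featureless region, the result is simply the first sample, which illustrates why the raw minimum is unreliable there.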
Depth map filtering approach
● Problem:
- Uniform regions in the reference image do not give a sufficiently discriminative photometric error
● Idea:
- Assume the depth is smooth in uniform regions
- Use a total variation approach in which the depth map is the function to optimize:
* the photometric error defines the data term
* the smoothness constraint defines the regularization
Inverse Depth Map Computation
● Featureless regions are prone to false minima
● Solution: regularization term
- We want to penalize deviation from a spatially smooth solution
- But preserve edges and discontinuities
Depth map filtering approach
● Formulation of the variational approach
- First term: regularization constraint. The weight g is close to 0 at strong image gradients and close to 1 in uniform regions, so that gradients of the depth map are penalized in uniform regions
- Second term: data term defined by the photometric error
- Huber norm: differentiable replacement for the L1 norm that preserves discontinuities better than L2
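Energy Functional
● Regularised cost: $E_\xi = \int_\Omega \{ g(u) \|\nabla \xi(u)\|_\epsilon + \lambda C(u, \xi(u)) \} \, du$
- $g(u) \|\nabla \xi(u)\|_\epsilon$ ... regularization term: per-pixel weight $g(u)$ times the Huber norm of the inverse depth gradient
- $\lambda C(u, \xi(u))$ ... photometric cost term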
Total Variation (TV) Regularization
● L1 penalization of gradient magnitudes
- Favors sparse, piecewise-constant solutions
- Allows sharp discontinuities in the solution
● Problem
- Staircasing
- Can be reduced by using quadratic penalization for small gradient magnitudes (the Huber norm)
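The quadratic-for-small, linear-for-large behaviour can be written down directly. The sketch below assumes the usual scalar definition of the Huber norm with threshold `eps`; the function name is illustrative.

```python
def huber(x, eps):
    """Huber norm of a scalar: quadratic for |x| <= eps, L1-like beyond."""
    ax = abs(x)
    if ax <= eps:
        return ax * ax / (2.0 * eps)
    return ax - eps / 2.0
```

The two branches meet with equal value and slope at `|x| = eps`, which is what makes the penalty differentiable while still growing only linearly across depth discontinuities.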
Energy Functional Analysis
● Regularization term: convex
● Photometric cost term: not convex
Energy Minimization
● The combination of both terms is a non-convex function
● Possible solution
- Linearize the cost volume to get a convex approximation of the data term
- Solve the approximation iteratively within a coarse-to-fine warping scheme
* Can lead to loss of reconstruction detail
● Better solution?
Key observation
● The data term can be globally optimized by exhaustive search (point-wise optimization)
● The convex regularization term can be solved efficiently by convex optimization algorithms
● The energy functional can be approximated by decoupling the data and regularity terms, following the approach described in [1][2]
[1] F. Steinbrücker et al.: Large displacement optical flow computation without warping
[2] A. Chambolle: An Algorithm for Total Variation Minimization and Applications
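Alternating two Global Optimizations
● Decoupled functional: $E_{\xi,\alpha} = \int_\Omega \{ g(u) \|\nabla \xi(u)\|_\epsilon + \frac{1}{2\theta} (\xi(u) - \alpha(u))^2 + \lambda C(u, \alpha(u)) \} \, du$
* Data and regularity terms are decoupled via the auxiliary variable $\alpha: \Omega \to \mathbb{R}$
* The quadratic coupling term drives the original and auxiliary variables together
* Minimizing this functional is equivalent to minimizing the original formulation as θ → 0
* The optimization process is split into two sub-problems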
Alternating two Global Optimizations
● Energy functional can be globally minimized w.r.t. ξ
* Since it is convex in ξ
* E.g. gradient descent
● Energy functional can be globally minimized w.r.t. α
* Not convex w.r.t. α, but trivially point-wise optimizable
* Exhaustive search
Algorithm
● Initialization
- Compute $\alpha_u^0 = \xi_u^0 = \arg\min_d C(u, d)$
- $\theta^0$ = large value
● Repeat while $\theta^n > \theta_{end}$
- Compute $\xi_u^n$
* Minimize $E_{\xi,\alpha}$ with fixed α
* Use convex optimization tools, e.g. gradient descent
- Compute $\alpha_u^n$
* Minimize $E_{\xi,\alpha}$ with fixed ξ
* Exhaustive search
- Decrement θ
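The loop above can be illustrated on a one-dimensional toy problem. Everything specific in this sketch is an assumption made for illustration: a 1D "image" with unit weights g = 1, plain gradient descent for the convex step, a geometric decrease of θ controlled by `shrink`, and the hypothetical names `huber_grad` and `alternate`.

```python
def huber_grad(x, eps):
    """Derivative of the Huber norm: linear inside [-eps, eps], sign outside."""
    if abs(x) <= eps:
        return x / eps
    return 1.0 if x > 0 else -1.0


def alternate(cost, depths, lam=1.0, eps=0.05, theta=1.0, theta_end=1e-3,
              shrink=0.7, inner=200):
    n = len(cost)
    # Initialise alpha and xi with the per-pixel minimum of the cost volume.
    alpha = [depths[min(range(len(c)), key=c.__getitem__)] for c in cost]
    xi = list(alpha)
    while theta > theta_end:
        # Step 1: gradient descent on xi with alpha fixed (convex sub-problem).
        step = 1.0 / (4.0 / eps + 1.0 / theta)
        for _ in range(inner):
            grad = [(xi[i] - alpha[i]) / theta for i in range(n)]
            for i in range(n - 1):
                g = huber_grad(xi[i + 1] - xi[i], eps)
                grad[i] -= g
                grad[i + 1] += g
            xi = [x - step * gr for x, gr in zip(xi, grad)]
        # Step 2: exhaustive point-wise search on alpha with xi fixed.
        alpha = []
        for i in range(n):
            k = min(range(len(depths)),
                    key=lambda k: (xi[i] - depths[k]) ** 2 / (2.0 * theta)
                    + lam * cost[i][k])
            alpha.append(depths[k])
        theta *= shrink
    return xi, alpha
```

With a cost volume in which one pixel has an almost flat cost and a shallow false minimum, the regularized result follows the well-constrained neighbours instead of the false minimum.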
Even better
● Problem
- optimization is badly conditioned as $\nabla u \to 0$ (uniform regions)
- exhaustive search is expensive
- accuracy is not good enough
● Solution
- Primal-dual approach for the convex optimization step
- Acceleration of the non-convex search
- Sub-pixel accuracy
Primal-Dual Approach
● General class of energy minimization problems: $\min_x F(Kx) + G(x)$
* $F(Kx)$: usually the regularization term, often a norm $\|Kx\|$
* $G(x)$: data term
● Can obtain the dual form by replacing $F(Kx)$ by its convex conjugate $F^*(y)$
● Use duality principles to arrive at the primal-dual form of $g(u) \|\nabla \xi(u)\|_\epsilon + Q(u)$, following [1][2][3]
[1] J.-F. Aujol. Some first-order algorithms for total variation based image restoration
[2] A. Chambolle and T. Pock. A first-order primal-dual algorithm for convex problems with applications to imaging
[3] M. Zhu. Fast numerical algorithms for total variation based image restoration
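Primal-Dual Approach
● General problem formulation: $\min_x F(Kx) + G(x)$
● By definition (Legendre-Fenchel transform): $F(Kx) = \max_y \{ \langle Kx, y \rangle - F^*(y) \}$
● Dual form (saddle-point problem): $\min_x \max_y \{ \langle Kx, y \rangle - F^*(y) + G(x) \}$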
Primal-Dual Approach
● Conjugate of the Huber norm (obtained via the Legendre-Fenchel transform)
Minimization
● Solving a saddle-point problem now!
● Condition of optimality is met when the partial derivatives w.r.t. x and y vanish
● Compute the partial derivatives w.r.t. y and x
● Perform gradient steps
- Ascent on y (maximization)
- Descent on x (minimization)
Discretisation
● First some notation:
- Cost volume is discretized in an M × N × S array
* M × N ... reference image resolution
* S ... number of points linearly sampling the inverse depth range
- Use MN × 1 stacked rasterised column vectors
* d ... vector version of ξ
* a ... vector version of α
* g ... MN × 1 vector with per-pixel weights
* G = diag(g) ... element-wise weighting matrix
- $A d$ computes the 2MN × 1 gradient vector
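As an illustration of the stacked notation, the following sketch applies a gradient operator to a rasterised vector. The choices of row-major stacking (M rows, N columns), forward differences, zero differences at the image border, and the ordering "all horizontal differences, then all vertical ones" are assumptions; the slides do not specify them.

```python
def gradient(d, M, N):
    """Forward-difference gradient of a row-major stacked M*N vector.

    Returns a list of length 2*M*N: horizontal differences first,
    then vertical differences (zero where no neighbour exists).
    """
    gx = [0.0] * (M * N)
    gy = [0.0] * (M * N)
    for r in range(M):
        for c in range(N):
            i = r * N + c
            if c + 1 < N:
                gx[i] = d[i + 1] - d[i]
            if r + 1 < M:
                gy[i] = d[i + N] - d[i]
    return gx + gy
```

For a 2 × 3 image stacked as `[0, 1, 2, 3, 4, 5]`, the horizontal part is `[1, 1, 0, 1, 1, 0]` and the vertical part is `[3, 3, 3, 0, 0, 0]`.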
Implementation
● Replace the weighted Huber regularizer $F(AGd)$ by its conjugate $F^*(q)$
● Saddle-point problem
- Primal variable d and dual variable q
- Coupled with the data term
* Sum of convex and non-convex functions
Algorithm
● Compute the partial derivatives of the saddle-point energy w.r.t. q and d
● For fixed a, gradient ascent w.r.t. q and gradient descent w.r.t. d are performed
● For fixed d, exhaustive search w.r.t. a is performed
● $\theta$ is decremented
● Repeat while $\theta^n > \theta_{end}$
[1] A. Chambolle and T. Pock. A first-order primal-dual algorithm for convex problems with applications to imaging
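One ascent/descent iteration for fixed a can be sketched as follows. The sketch assumes a 1D signal with unit weights (G = I), the saddle-point energy ⟨Ad, q⟩ − (ε/2)‖q‖² + (1/2θ)‖d − a‖² with the constraint |q| ≤ 1, semi-implicit steps in the style of [1], and step sizes `sigma_q`, `sigma_d` chosen by hand; none of these specifics are stated on the slide.

```python
def primal_dual_step(d, q, a, theta, eps, sigma_q=0.5, sigma_d=0.5):
    """One dual-ascent / primal-descent iteration for fixed auxiliary a."""
    n = len(d)
    # Dual ascent on q (one value per edge), then projection onto |q| <= 1.
    q_new = []
    for i in range(n - 1):
        v = (q[i] + sigma_q * (d[i + 1] - d[i])) / (1.0 + sigma_q * eps)
        q_new.append(v / max(1.0, abs(v)))
    # Primal descent on d using the adjoint of the forward difference.
    d_new = []
    for i in range(n):
        left = q_new[i - 1] if i > 0 else 0.0
        right = q_new[i] if i < n - 1 else 0.0
        at_q = left - right
        d_new.append((d[i] + sigma_d * (-at_q + a[i] / theta))
                     / (1.0 + sigma_d / theta))
    return d_new, q_new
```

Iterating this step on a step-shaped `a` pulls the two plateaus towards each other while the dual variable stays inside the unit interval; in the full algorithm the iteration would alternate with the exhaustive search over a and the decrease of θ.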
Dense Tracking
● Inputs:
- Textured 3D model of the scene
- Pose at the previous frame
● Tracking as a registration problem
- First, inter-frame rotation estimation: the previous image is aligned with the current image to estimate a coarse inter-frame rotation
- The estimated pose is used to project the 3D model into a 2.5D image
- The 2.5D image is registered with the current frame to find the current pose
Template matching problem
Tracking Strategy and Algorithm
● Based on image alignment against dense model
● Coarse-to-fine strategy
- Pyramid hierarchy of images
● Lucas-Kanade algorithm
- Estimate “warp” between images
- Iterative minimization of a cost function
- Parameters of warp correspond to dimensionality of search space
Tracking in Two Stages
● Two stages
- Constrained rotation estimation
* Use coarser scales
* Rough estimate of pose $\hat{T}_{wl}$
- Accurate 6-DOF pose refinement
* Set virtual camera $v$ at location $T_{wv} = \hat{T}_{wl}$
* Project the dense model into the virtual camera: image $I_v$, inverse depth image $\xi_v$
* Align the live image $I_l$ and $I_v$ to estimate $T_{lv}$
* Final pose estimation: $T_{wl} = T_{wv} T_{vl}$, with $T_{vl} = T_{lv}^{-1}$
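The final composition step can be checked with plain 4 × 4 matrices. The sketch assumes the convention that $T_{ab}$ maps points from frame b into frame a, so that indices chain as $T_{wl} = T_{wv} T_{vl}$; the helper names `compose` and `invert` are illustrative.

```python
def compose(A, B):
    """Matrix product of two 4x4 rigid-body transforms."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def invert(T):
    """Inverse of [R t; 0 1], computed as [R^T, -R^T t; 0 1]."""
    R_t = [[T[j][i] for j in range(3)] for i in range(3)]
    t = [T[i][3] for i in range(3)]
    t_inv = [-sum(R_t[i][k] * t[k] for k in range(3)) for i in range(3)]
    return [R_t[i] + [t_inv[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]
```

Given the virtual camera pose `T_wv` and the estimated alignment `T_lv`, the live pose would be `compose(T_wv, invert(T_lv))` under this convention.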
SSD optimization
● Problem:
- Align template image T(x) with input image I(x)
● Formulation:
- Find the transform $W(x; p)$ that best maps the pixels of the template into those of the current image, minimizing: $\sum_x [I(W(x; p)) - T(x)]^2$
- $p = (p_1, \ldots, p_n)^T$ are the displacement parameters to be optimized
● Hypothesis:
- A coarse approximation $p_0$ of the template position is known
SSD optimization
● Problem:
- Minimize $\sum_x [I(W(x; p)) - T(x)]^2$
- The current estimate of p is iteratively updated to reach the minimum of the function.
● Formulations:
- Direct additive: $\sum_x [I(W(x; p + \Delta p)) - T(x)]^2$
- Direct compositional: $\sum_x [I(W(W(x; \Delta p); p)) - T(x)]^2$
- Inverse: $\sum_x [T(W(x; \Delta p)) - I(W(x; p))]^2$
SSD optimization
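● Example: direct additive method
- Minimize: $\sum_x [I(W(x; p + \Delta p)) - T(x)]^2$
- First-order Taylor expansion: $\sum_x [I(W(x; p)) + \nabla I \frac{\partial W}{\partial p} \Delta p - T(x)]^2$
- Solution: $\Delta p = \sum_x H^{-1} [\nabla I \frac{\partial W}{\partial p}]^T [T(x) - I(W(x; p))]$
- with: $H = \sum_x [\nabla I \frac{\partial W}{\partial p}]^T [\nabla I \frac{\partial W}{\partial p}]$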
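The direct additive update can be tried on the simplest possible warp, a 1D translation W(x; p) = x + p, for which ∂W/∂p = 1 and H is a scalar. The sketch assumes the image and template are given as callables that can be evaluated at non-integer positions and approximates ∇I by a central difference; the function name `lucas_kanade_1d` is hypothetical.

```python
import math


def lucas_kanade_1d(I, T, xs, p0=0.0, iters=20, h=1e-4):
    """Estimate the shift p minimizing sum_x [I(x + p) - T(x)]^2."""
    p = p0
    for _ in range(iters):
        H = 0.0  # scalar Gauss-Newton Hessian approximation
        b = 0.0  # accumulated gradient-weighted residual
        for x in xs:
            grad = (I(x + p + h) - I(x + p - h)) / (2.0 * h)  # image gradient at W(x; p)
            residual = T(x) - I(x + p)
            H += grad * grad
            b += grad * residual
        if H == 0.0:
            break
        p += b / H  # additive parameter update
    return p


# Example: the template is the image shifted by 0.7 samples.
image = lambda x: math.exp(-(x - 5.0) ** 2 / 8.0)
template = lambda x: image(x + 0.7)
shift = lucas_kanade_1d(image, template, range(11))
```

Starting from the coarse guess p = 0, the iterations move `shift` towards the true displacement of 0.7; with a poor initial guess the linearization would no longer hold, which is why a coarse approximation of the template position is required.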
SSD robustified
● Formulation: $\Delta p = \sum_x H^{-1} [\nabla I \frac{\partial W}{\partial p}]^T [T(x) - I(W(x; p))]$
● Problem: in case of occlusion, the occluded pixels shift the optimum of the function, so they have to be excluded from the optimization
● Method
- Only the pixels with a difference $[T(x) - I(W(x; p))]$ lower than a threshold are selected
- The threshold is iteratively updated to become more selective as the optimization approaches the optimum
Template matching in DTAM
● Inter-frame rotation estimation
- The template is the previous image, which is matched with the current image. The warp is defined on SO(3). The initial estimate of p is the identity.
● Full pose estimation
- The template is 2.5D; the warp is defined by full 3D motion estimation, which is on SE(3)
- The initial pose is given by the pose estimated at the previous frame and the inter-frame rotation estimation
6 DOF Image Alignment
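● Gauss-Newton gradient descent non-linear optimization of $F(\psi) = \frac{1}{2} \sum_{u \in \Omega} (f_u(\psi))^2$
● Residual: $f_u(\psi) = I_l(\pi(K T_{lv}(\psi) \pi^{-1}(u, \xi_v(u)))) - I_v(u)$
● Non-linear expression linearized by first-order Taylor expansion
● Pose parametrisation: $T_{lv}(\psi) = \exp(\sum_{i=1}^{6} \psi_i \, \mathrm{gen}_{SE(3)}^{i})$, where $\psi \in \mathbb{R}^6$ belongs to the Lie algebra $\mathfrak{se}(3)$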
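The exponential map from the six parameters to a 4 × 4 transform has a closed form. In the sketch below the ordering of ψ (three translational components followed by three rotational ones) is an assumption, since the slides do not fix the order of the generators; `hat` and `se3_exp` are illustrative names.

```python
import math


def hat(w):
    """Skew-symmetric matrix of a 3-vector."""
    return [[0.0, -w[2], w[1]], [w[2], 0.0, -w[0]], [-w[1], w[0], 0.0]]


def se3_exp(psi):
    """Closed-form exponential of a twist psi = (v, w) in se(3)."""
    v, w = psi[:3], psi[3:]
    theta = math.sqrt(sum(x * x for x in w))
    W = hat(w)
    W2 = [[sum(W[i][k] * W[k][j] for k in range(3)) for j in range(3)]
          for i in range(3)]
    if theta < 1e-8:
        a, b, c = 1.0, 0.5, 1.0 / 6.0  # limits of the coefficients as theta -> 0
    else:
        a = math.sin(theta) / theta
        b = (1.0 - math.cos(theta)) / theta ** 2
        c = (theta - math.sin(theta)) / theta ** 3
    eye = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    # Rodrigues formula for the rotation, and the matrix V that maps v to t.
    R = [[eye[i][j] + a * W[i][j] + b * W2[i][j] for j in range(3)] for i in range(3)]
    V = [[eye[i][j] + b * W[i][j] + c * W2[i][j] for j in range(3)] for i in range(3)]
    t = [sum(V[i][k] * v[k] for k in range(3)) for i in range(3)]
    return [R[i] + [t[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]
```

A zero twist gives the identity, and a twist with only a rotational z component of π/2 gives a quarter turn about the z axis, which is a quick sanity check of the parametrisation.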
Evaluation and Results
● Runs in real-time
- NVIDIA GTX 480 GPU
- i7 quad-core CPU
- Point Grey Flea2 camera
* Resolution 640×480
* 30 Hz
● Comparison with PTAM
- a challenging high-acceleration back-and-forth trajectory close to a cup
- with DTAM's relocaliser disabled
Evaluation and Results
● Unmodelled objects
● Camera defocus
Conclusions
● First live fully dense reconstruction system
● Significant advance in real-time geometrical vision
● Robust to
- rapid motion
- camera defocus
● Dense modelling and dense tracking let the system outperform point-based methods in modelling and tracking performance
Future Work
● Shortcomings
- Brightness constancy assumption
* often violated in the real world
* not robust to global illumination changes
- Smoothness assumption on depth
● Possible solutions
- integrate a normalized cross-correlation measure into the objective function for more robustness to local and global lighting changes
- joint modelling of the dense lighting and reflectance properties of the scene to enable more accurate photometric cost functions (the authors are more interested in this approach)
Thank You!

DTAM: Dense Tracking and Mapping in Real-Time, Robot vision Group
