LightFields.jl: Fast 3D image reconstruction for VR applications - Hector Andrade Loarca

LightFields.jl: Fast 3D image reconstruction for VR
applications
Héctor Andrade Loarca
Technical University of Berlin, BMS
7th of July, 2018
PyData 2018

Main goal
Present a novel technique to reconstruct the depth map of a scene
from a limited number of views. This can be applied in view synthesis
and rendering for free viewpoint VR.
Héctor Andrade Loarca (TUB) LightFields.jl PyData Berlin 2018

Main goal
Explain the main building blocks of the technique: Light Field and
Shearlets.
Show a free hardware/software implementation using Julia, Python
and Raspberry Pi.

What is a Light Field?
Light can be interpreted as a field, i.e. an assignment of a vector to each
point in space (M. Faraday, 1846).

Propagation of light rays in 3D space is completely described by a
7D continuous function $L : \mathbb{R}^7 \to \mathbb{R}^3$, $L(x, y, z, \theta, \phi, \lambda, \tau)$, called the
plenoptic function.
$L$ can be simplified to a 4D function $L_4$, called the 4D Light Field or
simply the Light Field, which quantifies the intensity of static and
monochromatic light rays propagating in half space.

4D Light Field Representation
Figure: Three different representations of the 4D LF. Left: $L_4(u, v, \phi, \theta)$. Center:
$L_4(\phi_1, \theta_1, \phi_2, \theta_2)$. Right: $L_4(u, v, s, t)$.
Figure: Representation used: "two-plane parametrization".
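As a rough illustration of the two-plane parametrization, a discretized light field can be stored as a 4D array. The sketch below assumes that (s, t) index the camera position on one plane and (u, v) the pixel on the image plane; the slides do not fix this convention, and the array sizes and helper names are hypothetical.

```python
import numpy as np

# Hypothetical toy light field stored as lf[s, t, v, u]:
# (s, t) = camera position on the first plane (assumed convention),
# (v, u) = pixel row and column on the second (image) plane.
n_s, n_t, n_v, n_u = 5, 1, 4, 6
lf = np.arange(n_s * n_t * n_v * n_u, dtype=float).reshape(n_s, n_t, n_v, n_u)

def view(lf, s, t):
    """Image seen from camera position (s, t)."""
    return lf[s, t]

def epi(lf, t, v):
    """Epipolar plane image: fix t and image row v, vary s and u."""
    return lf[:, t, v, :]

single_view = view(lf, 0, 0)    # shape (n_v, n_u)
epi_slice = epi(lf, 0, 2)       # shape (n_s, n_u)
```

Each row of `epi_slice` is the same image row seen from a different camera position, which is why scene points trace lines in it when the camera moves along a straight trajectory.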

From LF to 3D
Stereo Vision: The human brain generates the 3D depth perception
of its surroundings by triangulating the points of a scene using the
information coming from both eyes.

From LF to 3D
Epipolar Geometry: Generalization of Stereo Vision with more than
two views, assuming the epipolar constraint.
Epipolar Constraint: Analysis of object position while assuming the
knowledge of the camera motion.

Epipolar Plane Images (EPIs) on Straight Line Trajectories

Depth map estimation with EPIs
Point-depth formula: $D = \frac{h\,\Delta X}{\Delta u} = \frac{h\,\Delta X}{u_1 - u_2}$.
Sampling rate (Nyquist criterion): $\Delta X \le \frac{D_{\min}}{h}\,\Delta u$.
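A small numerical sketch of the two formulas above. It assumes that h is the distance between the two planes (focal length) expressed in pixels, that ΔX is the camera displacement between two views, and that u1 and u2 are the pixel positions of the same scene point in those views. The slide does not define these symbols explicitly, so the function names and numbers are illustrative only.

```python
def depth_from_disparity(h, delta_x, u1, u2):
    """Point-depth formula D = h * dX / (u1 - u2).

    h       : plane separation / focal length in pixels (assumed meaning)
    delta_x : camera displacement between the two views
    u1, u2  : pixel positions of the same point in the two views
    """
    delta_u = u1 - u2
    if delta_u == 0:
        raise ValueError("zero disparity: point at infinity")
    return h * delta_x / delta_u

def max_camera_spacing(d_min, h, delta_u=1.0):
    """Largest camera spacing allowed by dX <= (D_min / h) * du."""
    return d_min / h * delta_u

# Hypothetical numbers: h = 1000 px, cameras 10 mm apart, 5 px disparity.
depth_mm = depth_from_disparity(1000.0, 10.0, 105.0, 100.0)
# Spacing needed to keep the disparity at or below 1 px for a nearest depth of 2000 mm.
spacing_mm = max_camera_spacing(2000.0, 1000.0)
```

With these numbers the point lies 2000 mm away, and the cameras would need to be at most 2 mm apart to satisfy the criterion, which hints at why dense sampling is expensive.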

Commercial LF (Epipolar) camera

Our approach: Sub-Nyquist reconstruction via inpainting

(General) Image inpainting
Mathematical formulation
Recover an image $f \in X$ from known data:
$g = P_K(f)$
where $P_K$ is an orthogonal projection onto the known subspace $X_K \subset X$.
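For discrete images, one common choice of P_K is multiplication by a binary mask of known pixels. The sketch below assumes that choice (the slide does not specify the projection) and checks numerically the two properties that make it an orthogonal projection: idempotence and self-adjointness. All names are hypothetical.

```python
import numpy as np

def project_known(f, known):
    """Assumed P_K: keep the known pixels, set the unknown ones to zero."""
    return np.where(known, f, 0.0)

rng = np.random.default_rng(0)
f = rng.random((4, 5))              # toy image
other = rng.random((4, 5))          # second image, used for the adjoint check
known = rng.random((4, 5)) > 0.4    # True where the pixel is observed

g = project_known(f, known)         # observed data g = P_K(f)

# Idempotence: P_K(P_K(f)) = P_K(f)
idempotent = np.allclose(project_known(g, known), g)
# Self-adjointness: <P_K f, h> = <f, P_K h>
lhs = np.sum(project_known(f, known) * other)
rhs = np.sum(f * project_known(other, known))
self_adjoint = np.isclose(lhs, rhs)
```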

How to inpaint?
Frame
A frame for a Hilbert space $X$ is a collection $\Psi = \{\psi_i\}_{i \in I} \subset X$ satisfying
$A\|f\|^2 \le \|\{\langle f, \psi_i \rangle\}_{i \in I}\|_{\ell^2(I)}^2 \le B\|f\|^2 \quad \forall f \in X$
for some $0 < A \le B < \infty$.
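To make the frame inequality concrete, here is a minimal finite-dimensional sketch. It uses three unit vectors in the plane at 120 degrees to each other (often called the Mercedes-Benz frame), which is not from the slides. In finite dimensions the optimal bounds A and B are the smallest and largest eigenvalues of the frame operator.

```python
import numpy as np

# Three unit vectors in R^2, 120 degrees apart (illustrative example only).
angles = np.pi / 2 + 2 * np.pi * np.arange(3) / 3
Psi = np.stack([np.cos(angles), np.sin(angles)], axis=1)  # rows are frame vectors

def frame_bounds(Psi):
    """Optimal frame bounds: extreme eigenvalues of S = Psi^T Psi."""
    eigvals = np.linalg.eigvalsh(Psi.T @ Psi)  # returned in ascending order
    return eigvals[0], eigvals[-1]

A, B = frame_bounds(Psi)

# Check A*||f||^2 <= sum_i <f, psi_i>^2 <= B*||f||^2 on a random vector.
rng = np.random.default_rng(1)
f = rng.standard_normal(2)
energy = np.sum((Psi @ f) ** 2)
norm_sq = f @ f
```

Here A and B coincide (both equal 3/2), so the frame is tight: it is redundant, yet behaves like an orthonormal basis up to a constant factor.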
Sparse Regularization/CS approach (Genzel, Kutyniok, 2014):
"If a signal (image) is sparse within a frame $\Psi$, it can be recovered from
highly underdetermined, non-adaptive linear measurements by
$\ell^1$-regularization, i.e.
$\min_{\tilde f \in X} \|\{\langle \tilde f, \psi_i \rangle\}_{i \in I}\|_{\ell^1(I)} \quad \text{s.t.} \quad P_K(\tilde f) = g = P_K(f)$"
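The following is a toy sketch of sparsity-based inpainting, not the solver used in the talk. It swaps in two simplifications that are assumptions of this example: an orthonormal Fourier basis stands in for the shearlet frame, and a simple heuristic (iterative soft thresholding with a decreasing threshold, re-imposing the known samples after each step) stands in for an exact solver of the constrained minimization. Soft thresholding is the proximal map of the l1 norm, which is the link to the formulation above.

```python
import numpy as np

def soft_threshold(c, lam):
    """Shrink the magnitude of each (complex) coefficient by lam."""
    mag = np.abs(c)
    return c * np.maximum(1.0 - lam / np.maximum(mag, 1e-12), 0.0)

def inpaint_sparse(g, known, n_iter=100):
    """Heuristic inpainting: threshold in the Fourier domain, then
    re-impose the known samples so that P_K(f) = g holds exactly."""
    f = g.astype(float).copy()
    lam0 = np.abs(np.fft.fft(f, norm="ortho")).max()
    for k in range(n_iter):
        lam = lam0 * (1.0 - (k + 1) / n_iter)   # decreases linearly to 0
        c = soft_threshold(np.fft.fft(f, norm="ortho"), lam)
        f = np.real(np.fft.ifft(c, norm="ortho"))
        f[known] = g[known]                     # data consistency
    return f

# Toy 1D signal that is sparse in the Fourier basis.
n = 128
t = np.arange(n)
f_true = np.cos(2 * np.pi * 3 * t / n) + 0.5 * np.sin(2 * np.pi * 7 * t / n)

rng = np.random.default_rng(0)
known = rng.random(n) < 0.7                     # roughly 70% of samples observed
g = np.where(known, f_true, 0.0)

f_rec = inpaint_sparse(g, known)
err_zero = np.linalg.norm(g - f_true)           # error of leaving gaps at zero
err_rec = np.linalg.norm(f_rec - f_true)        # error after inpainting
```

Comparing `err_rec` with `err_zero` shows how much of the missing data the sparsity prior fills in for this toy signal.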

Frames for images and optimal sparsity
Gabor frames (Gabor, 1946).
Wavelet frames (Morlet et al., 1984).
Curvelet frames (Candès et al., 1999).
Shearlet frames (Kutyniok et al., 2005).

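Best N-term approx. error (Donoho, 2001)
Let $\{\psi_\lambda\}_{\lambda \in \Lambda} \subset L^2(\mathbb{R}^2)$ be a frame. The optimal best N-term approximation
error for any $f \in \mathcal{E}^2(\mathbb{R}^2)$ is
$\sigma_N(f, \{\psi_\lambda\}_{\lambda \in \Lambda}) = O(N^{-1})$
Error of 2D wavelets
$\sigma_N(f, \{\psi_\lambda\}_{\lambda \in \Lambda}) \sim N^{-1/2}$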
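To illustrate what a best N-term approximation error measures, the sketch below keeps the N largest coefficients of a signal in an orthonormal basis and measures the residual. It is a 1D toy with a Fourier basis and a step signal, chosen only for simplicity. It does not reproduce the 2D rates quoted above, which concern frames and cartoon-like images.

```python
import numpy as np

def best_n_term_error(f, n_terms):
    """L2 error after keeping the n_terms largest Fourier coefficients."""
    c = np.fft.fft(f, norm="ortho")
    keep = np.argsort(np.abs(c))[::-1][:n_terms]   # indices of the largest coefficients
    c_n = np.zeros_like(c)
    c_n[keep] = c[keep]
    f_n = np.fft.ifft(c_n, norm="ortho")
    return np.linalg.norm(f - f_n)

# Toy signal with a single jump (a 1D analogue of an edge).
f = (np.arange(256) >= 100).astype(float)
term_counts = (0, 4, 16, 64, 256)
errors = [best_n_term_error(f, n) for n in term_counts]
```

Because the basis is orthonormal, the error equals the norm of the discarded coefficients, so it can only decrease as N grows. How fast it decreases is what distinguishes the representation systems listed above.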

Shearlet Transform (Kutyniok, Guo, Labate, 2005)
Classical Shearlet Transform
$\langle f, \psi_{j,k,m} \rangle = \int_{\mathbb{R}^2} f(x)\,\psi_{j,k,m}(x)\,dx$
where
$SH(\psi) = \{\psi_{j,k,m}(x) = 2^{3j/4}\,\psi(S_k A_j x - m) : (j, k) \in \mathbb{Z}^2,\ m \in \mathbb{Z}^2\}$
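The slide does not spell out the matrices S_k and A_j. The sketch below assumes the convention common in the shearlet literature: a shear matrix S_k = [[1, k], [0, 1]] and a parabolic scaling matrix A_j = diag(2^j, 2^(j/2)). Under that assumption the factor 2^(3j/4) is exactly the square root of det(A_j), which keeps the norm of every shearlet equal to that of the generator.

```python
import numpy as np

def shear(k):
    """Shear matrix S_k (assumed convention)."""
    return np.array([[1.0, float(k)], [0.0, 1.0]])

def parabolic_scale(j):
    """Parabolic scaling matrix A_j = diag(2^j, 2^(j/2)) (assumed convention)."""
    return np.array([[2.0 ** j, 0.0], [0.0, 2.0 ** (j / 2)]])

def shearlet_argument(x, j, k, m):
    """Point at which the generator psi is evaluated: S_k A_j x - m."""
    return shear(k) @ parabolic_scale(j) @ x - m

def normalization(j):
    """Prefactor 2^(3j/4) from the definition of SH(psi)."""
    return 2.0 ** (3 * j / 4)

# Example: scale j = 2, shear k = 1, no translation.
y = shearlet_argument(np.array([1.0, 1.0]), 2, 1, np.zeros(2))
norm_factor = normalization(2)
sqrt_det = np.sqrt(np.linalg.det(parabolic_scale(2)))
```

The anisotropic scaling (one axis scaled by 2^j, the other by 2^(j/2)) produces elongated elements, and the shear changes their orientation without rotating the grid.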

Modification: Cone-adapted Shearlet transform
$SH(\phi, \psi, \tilde\psi, c) := P_R \Phi(\phi, c_1) \cup P_{C_1} \Psi(\psi, c) \cup P_{C_2} \tilde\Psi(\tilde\psi, c)$
Cone shearlets sparsity (band-limited case: Lim, Labate, 2006;
compactly supported case: Kutyniok, Lim, 2011)
Best N-term approximation error
$\sigma_N(f, \{\psi_{j,k,m}\}_{j,k,m}) \sim N^{-1} (\log N)^{3/2}$

Followed Pipeline

Physical Acquisition Setup
Data set: Sequence of 101 rectified pictures of a scene generated by
Professor Markus Gross' group at the Disney Research Center in
Zürich.
Technical details of physical setup: Canon EOS 5D Mark II DSLR
camera, Canon EF 50 mm f/1.4 USM lens and a Zaber T-LST1500D
motorized linear stage to drive the camera to the shooting positions,
spaced 10 mm apart.
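One way to connect this acquisition setup to the inpainting formulation: if only every few views are kept, the rows of an EPI that belong to the discarded views become the unknown region. The sketch below builds the camera positions from the stated 101 views at 10 mm spacing and a mask of known EPI rows. The decimation factor and the EPI width are hypothetical values chosen for illustration, not figures from the talk.

```python
import numpy as np

n_views = 101          # from the data set description
spacing_mm = 10.0      # from the data set description
positions_mm = spacing_mm * np.arange(n_views)   # camera positions along the stage

step = 4               # hypothetical decimation factor (assumption)
epi_width = 8          # hypothetical EPI width in pixels (assumption)

# Rows of the EPI correspond to views; keep every `step`-th one as known.
known_rows = np.zeros(n_views, dtype=bool)
known_rows[::step] = True

# Mask of known EPI pixels: entire rows are either observed or missing.
mask = np.repeat(known_rows[:, None], epi_width, axis=1)

total_travel_mm = positions_mm[-1]
n_known_views = int(known_rows.sum())
```

A mask of this shape is what a masking projection P_K would use when the inpainting is applied to EPIs rather than to ordinary images.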

Used Data Set: Church

Point Tracking Results

Example of EPI

Results on EPIs inpainting

Results on line detection and depth map estimation

Open Hardware Implementation
Raspberry Pi + Camera Module v2

Future work

Thanks!
Questions?
