Introduction to Binocular Stereo in Computer Vision

Binocular Stereo
CS5670: Computer Vision
Single image stereogram,
https://en.wikipedia.org/wiki/Autostereogram
What is this?

Announcements
• Project 3 due tomorrow, Friday, March 18 at 8pm (code),
Monday, March 21 at 8pm (artifact)
• Project 4 (Stereo) to be released on Tuesday, March 22,
due Friday, April 1, by 8pm
– To be done in groups of two

“Mark Twain at Pool Table”, no date, UCR Museum of Photography

https://giphy.com/gifs/wigglegram-706pNfSKyaDug

Stereo Vision as Localizing Points in 3D
• An object point will project to some point in our image
• That image point corresponds to a ray in the world
• Two rays intersect at a single point, so if we want to localize points in 3D we need 2 eyes

Stereo
• Given two images from different viewpoints
– How can we compute the depth of each point in the image?
– Based on how much each pixel moves between the two
images

Epipolar geometry
Two images captured by a purely horizontally translating camera (rectified stereo pair)
• Corresponding pixels (x1, y1) and (x2, y1) lie on the same row: the epipolar lines are horizontal
• x2 - x1 = the disparity of pixel (x1, y1)

Disparity is proportional to inverse depth
http://stereo.nypl.org/view/41729
(Or, hold a finger in front of your face and wink each eye in succession.)

Your basic stereo matching algorithm
• Match Pixels in Conjugate Epipolar Lines
– Assume brightness constancy
– This is a challenging problem
– Hundreds of approaches
• A good survey and evaluation: http://www.middlebury.edu/stereo/

Your basic stereo matching algorithm
For each epipolar line
  For each pixel in the left image
    • compare with every pixel on the same epipolar line in the right image
    • pick the pixel with minimum match cost
Improvement: match windows
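The loop above, with the window improvement, can be sketched in a few lines of plain Python. This is a minimal illustration rather than the course's reference implementation: the names `ssd_window` and `block_match`, the list-of-rows image layout, and the parameters `max_disp` and `half` are assumptions made for the example. It also assumes a rectified pair in which a left pixel at x matches a right pixel at x + d with d >= 0.

```python
def ssd_window(left, right, x, y, d, half):
    """SSD between the window centred at (x, y) in `left` and the
    window centred at (x + d, y) in `right` (same row: rectified pair)."""
    total = 0.0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            diff = left[y + dy][x + dx] - right[y + dy][x + d + dx]
            total += diff * diff
    return total


def block_match(left, right, max_disp, half=1):
    """Return a disparity map as a list of rows.
    Border pixels, where the window does not fit, are left as None."""
    height, width = len(left), len(left[0])
    disparity = [[None] * width for _ in range(height)]
    for y in range(half, height - half):          # each epipolar line
        for x in range(half, width - half):       # each left pixel
            best_d, best_cost = None, float("inf")
            for d in range(max_disp + 1):         # each candidate match
                if x + d + half >= width:
                    break  # window would leave the right image
                cost = ssd_window(left, right, x, y, d, half)
                if cost < best_cost:
                    best_d, best_cost = d, cost
            disparity[y][x] = best_d
    return disparity
```

The search is exhaustive, so the cost grows with image size, disparity range, and window area; practical implementations typically vectorize or reuse partial sums.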

Stereo matching based on SSD
[Figure: SSD as a function of disparity d; the best matching disparity is the minimum, at d_min]

Window size
– Smaller window
+ more detail
- more noise
– Larger window
+ less noise
- less detail
[Figure: effect of window size, W = 3 and W = 20]
Better results with adaptive window
• T. Kanade and M. Okutomi, A Stereo Matching Algorithm with an Adaptive Window: Theory and Experiment, ICRA 1991.
• D. Scharstein and R. Szeliski, Stereo Matching with Nonlinear Diffusion, IJCV, July 1998.

Stereo results
– Data from University of Tsukuba
– Similar results on other images without ground truth
[Figure: scene and ground truth]

Results with window search
[Figure: window-based matching (best window size) vs. ground truth]

Better methods exist...
[Figure: graph cuts-based method vs. ground truth]
Boykov et al., Fast Approximate Energy Minimization via Graph Cuts, International Conference on Computer Vision 1999.
For the latest and greatest: http://www.middlebury.edu/stereo/

Stereo as energy minimization
• What defines a good stereo correspondence?
1. Match quality
• Want each pixel to find a good match in the other image
2. Smoothness
• If two pixels are adjacent, they should (usually) move about the same amount

• Find disparity map d that minimizes an energy function $E(d)$
• Simple pixel / window matching:
$E(d) = \sum_{(x,y) \in I} C(x, y, d(x,y))$
$C(x, y, d(x,y))$ = SSD distance between windows I(x, y) and J(x + d(x,y), y)

C(x, y, d): the disparity space image (DSI)
[Figure: DSI slice for scanline y = 141; horizontal axis x, vertical axis d]

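Greedy selection of best match
Simple pixel / window matching: choose the minimum of each column in the DSI independently:
$d(x, y) = \arg\min_{d'} C(x, y, d')$
[Figure: DSI slice for scanline y = 141; horizontal axis x, vertical axis d]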
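To make the DSI concrete, here is a small sketch that builds one scanline's slice C(x, d) and then takes the per-column minimum. The function names `scanline_dsi` and `winner_take_all` are made up for this example, and the cost is a single-pixel squared difference rather than a window SSD, purely to keep the code short. Matches that would fall outside the row get infinite cost.

```python
def scanline_dsi(left_row, right_row, max_disp):
    """Return dsi[d][x]: squared difference between left_row[x] and
    right_row[x + d]. Out-of-range matches are given infinite cost."""
    width = len(left_row)
    inf = float("inf")
    return [
        [(left_row[x] - right_row[x + d]) ** 2 if x + d < width else inf
         for x in range(width)]
        for d in range(max_disp + 1)
    ]


def winner_take_all(dsi):
    """Greedy selection: for each column x, the disparity of lowest cost.
    Ties go to the smallest disparity."""
    width = len(dsi[0])
    return [min(range(len(dsi)), key=lambda d: dsi[d][x])
            for x in range(width)]
```

Each column is decided on its own, so nothing stops neighboring pixels from jumping between very different disparities.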

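• Better objective function
$E(d) = E_d(d) + \lambda E_s(d)$
– $E_d(d)$, match cost: want each pixel to find a good match in the other image
– $E_s(d)$, smoothness cost: adjacent pixels should (usually) move about the same amount

match cost: $E_d(d) = \sum_{(x,y) \in I} C(x, y, d(x,y))$
smoothness cost: $E_s(d) = \sum_{(p,q) \in \mathcal{E}} V(d_p, d_q)$
$\mathcal{E}$: set of neighboring pixels (4-connected or 8-connected neighborhood)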

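Smoothness cost
How do we choose V, the penalty on the disparities $d_p$, $d_q$ of two neighboring pixels?
– L1 distance: $V(d_p, d_q) = |d_p - d_q|$
– “Potts model”: $V(d_p, d_q) = 0$ if $d_p = d_q$, 1 otherwise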

Smoothness cost
• If λ = infinity, then we only consider smoothness
• Optimal solution is a surface of constant depth/disparity
– Fronto-parallel surface
• In practice, want to balance data term with smoothness
term
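The trade-off between the data term and the smoothness term can be seen by evaluating the energy of a candidate disparity map directly. The sketch below assumes a precomputed cost volume indexed as `cost[y][x][d]`, a 4-connected neighborhood, and a weight `lam` standing in for λ. The function names `potts`, `l1`, and `energy` are illustrative choices, not taken from the slides.

```python
def potts(dp, dq):
    """Potts model: constant penalty whenever neighbours disagree."""
    return 0 if dp == dq else 1


def l1(dp, dq):
    """L1 distance: penalty grows with the size of the disparity jump."""
    return abs(dp - dq)


def energy(cost, disp, lam, V=potts):
    """Match cost plus lam times smoothness cost on a 4-connected grid.
    cost[y][x][d] is the match cost; disp[y][x] is the chosen disparity."""
    height, width = len(disp), len(disp[0])
    match = sum(cost[y][x][disp[y][x]]
                for y in range(height) for x in range(width))
    smooth = 0
    for y in range(height):
        for x in range(width):
            if x + 1 < width:    # right neighbour
                smooth += V(disp[y][x], disp[y][x + 1])
            if y + 1 < height:   # bottom neighbour
                smooth += V(disp[y][x], disp[y + 1][x])
    return match + lam * smooth
```

With `lam = 0` only the match cost counts. As `lam` grows, any disparity change between neighbors becomes expensive, so constant-disparity maps win.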

Dynamic programming
• Can minimize this independently per scanline using
dynamic programming (DP)

Dynamic programming
• Finds “smooth”, low-cost path through DSI from left to right
• Visiting a node incurs its data cost; switching disparities from one column to the next also incurs a (smoothness) cost
[Figure: DSI slice for scanline y = 141; horizontal axis x, vertical axis d]
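One way to realize this left-to-right path search is a Viterbi-style recurrence over the columns of a scanline's DSI. The sketch below is an illustration under assumptions, not necessarily the formulation used in the course. It takes `dsi[d][x]` as the data cost, uses an L1 smoothness penalty by default, and weights it by `lam`. The function name `scanline_dp` is hypothetical.

```python
def scanline_dp(dsi, lam, V=lambda a, b: abs(a - b)):
    """Return the disparity path d_0..d_{W-1} minimising
    sum_x dsi[d_x][x] + lam * sum_x V(d_x, d_{x+1})."""
    num_disp, width = len(dsi), len(dsi[0])
    # best[d]: cheapest cost of any path ending at disparity d in column 0
    best = [dsi[d][0] for d in range(num_disp)]
    back = []  # back[x - 1][d]: best predecessor disparity for (x, d)
    for x in range(1, width):
        new_best, pointers = [], []
        for d in range(num_disp):
            prev = min(range(num_disp),
                       key=lambda p: best[p] + lam * V(p, d))
            new_best.append(best[prev] + lam * V(prev, d) + dsi[d][x])
            pointers.append(prev)
        best = new_best
        back.append(pointers)
    # Backtrack from the cheapest end state to recover the path
    d = min(range(num_disp), key=lambda k: best[k])
    path = [d]
    for pointers in reversed(back):
        d = pointers[d]
        path.append(d)
    path.reverse()
    return path
```

For a row of width W with k disparities this takes O(W k^2) time. Each scanline is solved independently, so nothing enforces consistency between rows.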

Dynamic programming
• Can we apply this trick in 2D as well?
• No: the shortest path trick only works to find a 1D path
Slide credit: D. Huttenlocher

Stereo as a minimization problem
• The 2D problem has many local minima
– Gradient descent doesn’t work well
• And a large search space
– n x m image with k disparities has $k^{nm}$ possible solutions
– Finding the global minimum is NP-hard in general
• Good approximations exist (e.g., graph cuts algorithms)

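Depth from disparity
[Figure: camera centers C and C’ separated by the baseline, each with focal length f; scene point X at depth z projects to x and x’]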
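For a rectified pair, similar triangles in this geometry give the standard relation disparity = x - x’ = (baseline · f) / z, so depth follows as z = (baseline · f) / disparity. The helper below is a small sketch of that conversion. The function name and the choice of units (focal length and disparity in pixels, baseline in meters) are assumptions for illustration.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth z = f * B / disparity for a rectified stereo pair.
    A zero (or negative) disparity is treated as a point at infinity."""
    if disparity_px <= 0:
        return float("inf")
    return focal_px * baseline_m / disparity_px
```

Because depth varies as 1/disparity, a one-pixel disparity error costs far more depth accuracy for distant points than for nearby ones.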

Real-time stereo
• Used for robot navigation (and other tasks)
– Several real-time stereo techniques have been developed
(most based on simple discrete search)
Nomad robot searches for meteorites in Antarctica

Stereo reconstruction pipeline
• Steps
– Calibrate cameras
– Rectify images
– Compute disparity
– Estimate depth
What will cause errors?
• Camera calibration errors
• Poor image resolution
• Occlusions
• Violations of brightness constancy (specular reflections)
• Large motions
• Low-contrast image regions

Active stereo with structured light
• Project “structured” light patterns onto the object
– simplifies the correspondence problem
– basis for active depth sensors, such as Kinect and iPhone X (using
[Figure: two setups, one with camera 1 and a projector, one with camera 1, camera 2 and a projector; Li Zhang’s one-shot stereo]

Active stereo with structured light
https://ios.gadgethacks.com/news/watch-iphone-xs-30k-ir-dots-scan-your-face-0180944/

Laser scanning
• Optical triangulation
– Project a single stripe of laser light
– Scan it across the surface of the object
– This is a very precise version of structured light scanning
Digital Michelangelo Project
http://graphics.stanford.edu/projects/mich/

Laser scanned models
The Digital Michelangelo Project, Levoy et al.

3D Photography on your Desk
http://www.vision.caltech.edu/bouguetj/ICCV98/
