Introduction to Visual Odometry
Brian Holt
CONTENTS
01 Introduction
02 Imaging with a camera
03 Feature detection: Finding points
04 Tracking and matching: Keeping points
05 Camera motion estimation: Epipolar Geometry
06 RANSAC: Handling noisy correspondences
07 A visual odometry pipeline
Introduction
• odos + metron = way + measure
• Measure position relative to a world coordinate frame
• Wheel odometry measures rotations
  • Slippage is difficult to account for
  • Direction is not known
• Visual odometry is useful in many environments
  • Mars rovers (e.g. the Opportunity rover)
  • Autonomous vehicles
SfM, SLAM, VO
[Venn diagram: Visual Odometry within V-SLAM within SfM]
SLAM = VO + Loop Closure + BA
Roadmap
Image courtesy: D. Scaramuzza
Why use a camera?
• Vast information
• Extremely low Size, Weight, and Power (SWaP) footprint
• Cheap and easy to use
• Passive sensor
• Processing power is OK today
It's what nature uses too!
Slide courtesy: S. Weiss
Cellphone-type camera: up to 16 Mp (480 MB/s @ 30 Hz)
Cellphone processor unit: 1.7 GHz quad-core ARM, <10 g
The Camera is a Bearing Sensor
Slide courtesy: S. Weiss
• Projective sensor which measures the bearing of a point with respect to the optical axis
• Depth can be inferred by re-observing a point from different angles
  • The movement (i.e. the angle between the observations) is the point's parallax
• A point at infinity is a feature which exhibits no parallax during camera motion
  • The distance of a star cannot be inferred by moving a few kilometres
  • BUT: it is a perfect bearing reference for attitude estimation: NASA's star tracker sensors are accurate to better than 1 arc second (0.00027 deg)
Image Formation
Let's design a camera
• Idea: put a piece of film in front of an object
• Do we get a reasonable image?
Slide courtesy: C. Stachniss, S. Seitz
Pinhole Camera
Let's design a camera
• Add a barrier to block off most of the rays
• This reduces blurring
• The opening is known as the aperture
• How does this transform the image?
Slide courtesy: C. Stachniss, S. Seitz
Pinhole Camera
• The pinhole camera is a simple model to approximate the imaging process
• If we treat the pinhole as a point, only one ray from any point can enter the camera
Slide courtesy: C. Stachniss; Image courtesy: Forsyth and Ponce
[Figure: virtual image, pinhole, image plane]
Camera Obscura (1544)
Slide courtesy: C. Stachniss; Image courtesy: http://www.acmi.net.au
"Reinerus Gemma-Frisius observed an eclipse of the sun at Louvain on January 24, 1544, and later he used this illustration of the event in his book De Radio Astronomica et Geometrica, 1545. It is thought to be the first published illustration of a camera obscura..."
Hammond, John H., The Camera Obscura, A Chronicle
Camera obscura is Latin for "dark room".
Pinhole Camera Model
• Similarity of gray triangles
• Image scale
• Mapping
Slide courtesy: C. Stachniss; Image courtesy: Foerstner
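The equations on this slide were images in the original deck. A standard reconstruction, assuming the usual notation (focal length $f$, object point $(X, Y, Z)$ in camera coordinates, image point $(x', y')$): the similar gray triangles give the mapping and the image scale $m$,

$$\frac{x'}{f} = \frac{X}{Z}, \qquad \frac{y'}{f} = \frac{Y}{Z}, \qquad m = \frac{f}{Z},$$

so $x' = f\,X/Z$ and $y' = f\,Y/Z$.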
Pinhole Camera Model
• Small hole: sharp image, but requires long exposure times
• Large hole: short exposure times, but blurry images
• Solution: replace the pinhole with lenses
Slide courtesy: C. Stachniss; Image courtesy: Foerstner
Lens Approximates the Pinhole
• A lens is only an approximation of the pinhole camera model
• The corresponding point on the object, the corresponding point in the image, and the centre of the lens should lie on one line
• The further a ray passes from the centre of the lens, the larger the error
• Use an aperture to limit the error (a trade-off between the usable light and the price of the lens)
Slide courtesy: C. Stachniss
Three Assumptions Made in the Pinhole/Thin Lens Model
1. All rays from the object intersect in a single point
2. All image points lie on a plane
3. The ray from the object point to the image point is a straight line
Often these assumptions do not hold, leading to imperfect images.
Slide courtesy: C. Stachniss; Images courtesy: Wikipedia
[Images: chromatic aberration, coma, astigmatism]
Distortion is another common flaw
Slide courtesy: C. Stachniss; Image courtesy: Wikipedia
Distortion is a deviation from rectilinear projection, a projection in which straight lines in a scene remain straight in the image. Common types: barrel distortion, pincushion distortion, mustache distortion.
Perspective Effects
Objects that are closer appear larger
Vanishing Points
Parallel lines converge in the image at vanishing points
Perspective Projection
Slide courtesy: D. Scaramuzza
1. Convert a point in world coordinates to a point in camera coordinates
2. Convert a point in camera coordinates to the image plane
3. Convert a point in the image plane to pixel coordinates
3D Point to 2D Pixel
Slide courtesy: D. Scaramuzza
Convert a point in camera coordinates to the image plane:
1. The point projects onto the image plane
2. Apply similar triangles (see the reconstruction below)
3. The same is true for y
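The projection formulas themselves were images in the deck; the standard similar-triangles result, assuming camera-frame coordinates $(X_c, Y_c, Z_c)$ and image-plane coordinates $(x, y)$:

$$x = f\,\frac{X_c}{Z_c}, \qquad y = f\,\frac{Y_c}{Z_c}.$$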
2D Plane to 2D Pixel
Slide courtesy: D. Scaramuzza
Convert a point in the image plane to pixel coordinates:
1. Account for the pixel coordinates of the optical centre
2. Account for scale factors (see the reconstruction below)
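A standard reconstruction of this step, assuming the optical centre $(u_0, v_0)$ in pixels and scale factors $k_u, k_v$ (pixels per unit length on the sensor):

$$u = u_0 + k_u\,x, \qquad v = v_0 + k_v\,y.$$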
2D Plane to 2D Pixel
Image courtesy: D. Scaramuzza
Convert a point in the image plane to pixel coordinates:
1. Use homogeneous coordinates to map from 3D to 2D
2. $\lambda$ is the scale (see the reconstruction below)
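In homogeneous form (a standard reconstruction; $\lambda = Z_c$ absorbs the perspective division):

$$\lambda \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} f k_u & 0 & u_0 \\ 0 & f k_v & v_0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} X_c \\ Y_c \\ Z_c \end{bmatrix}.$$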
2D Plane to 2D Pixel
Image courtesy: D. Scaramuzza
Convert a point in the image plane to pixel coordinates:
1. K is a matrix containing the focal lengths and the image origin
2. K is often called the intrinsic or calibration matrix (written out below)
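Naming the matrix from the previous slide (standard notation, with $\alpha_u = f k_u$ and $\alpha_v = f k_v$):

$$K = \begin{bmatrix} \alpha_u & 0 & u_0 \\ 0 & \alpha_v & v_0 \\ 0 & 0 & 1 \end{bmatrix}, \qquad \lambda \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = K \begin{bmatrix} X_c \\ Y_c \\ Z_c \end{bmatrix}.$$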
3D Point to 3D Point
Image courtesy: D. Scaramuzza
Convert a point in world coordinates to a point in camera coordinates (see the reconstruction below)
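The transformation appeared only as an image; reconstructed from the editor's notes (slide #28), with the left-hand side (the camera-frame point) inferred from context, the world-to-camera mapping in homogeneous coordinates is

$$\begin{bmatrix} X_c \\ Y_c \\ Z_c \end{bmatrix} = \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_1 \\ r_{21} & r_{22} & r_{23} & t_2 \\ r_{31} & r_{32} & r_{33} & t_3 \end{bmatrix} \begin{bmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{bmatrix} = \left[\mathbf{R} \mid T\right] \begin{bmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{bmatrix}.$$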
Perspective Projection
Image courtesy: D. Scaramuzza
Convert a point in world coordinates to a point in camera coordinates
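Chaining the extrinsic and intrinsic steps gives the full perspective projection from a world point to a pixel (standard form, consistent with the slides above):

$$\lambda \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = K \left[\mathbf{R} \mid T\right] \begin{bmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{bmatrix}.$$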
Roadmap
Image courtesy: D. Scaramuzza
Local Image Features
Slide courtesy: D. Scaramuzza
• How would you align these images?
• Detect point features in both images
• Find corresponding pairs
• Use these pairs to align the images
Matching with Features
Slide courtesy: D. Scaramuzza
• Problem 1: Detect the same points independently in both images. Otherwise there is no chance of a match!
A repeatable feature detector is required.
Matching with Features
Slide courtesy: D. Scaramuzza
• Problem 2: For each point, find the corresponding point in the other image.
A reliable and distinctive feature descriptor is required that is invariant to geometric and illumination changes.
What is a Distinctive Feature?
Slide courtesy: D. Scaramuzza
• Notice how some patches can be matched with higher accuracy
• Patches with detail are good
Point Features: Corners vs Blobs
Slide courtesy: D. Scaramuzza
Finding and Tracking Points
Image courtesy: OpenCV

import numpy as np
import cv2
…
# Take the first frame and find corners in it
ret, old_frame = cap.read()
old_gray = cv2.cvtColor(old_frame, cv2.COLOR_BGR2GRAY)
p0 = cv2.goodFeaturesToTrack(old_gray, mask=None, **feature_params)
…
while True:
    ret, frame = cap.read()
    frame_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # calculate optical flow
    p1, st, err = cv2.calcOpticalFlowPyrLK(old_gray, frame_gray, p0, None, **lk_params)

• The Lucas-Kanade tracker uses optical flow (a fuller runnable sketch follows below)
• It assumes the pixel intensities of an object do not change between frames
• It assumes neighbouring pixels have similar motion
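A minimal self-contained version of this loop; "video.mp4" is a placeholder path, and the parameter values are the typical choices from the OpenCV tutorial this excerpt comes from:

import numpy as np
import cv2

cap = cv2.VideoCapture("video.mp4")  # placeholder path

# Shi-Tomasi corner detection and Lucas-Kanade tracker parameters
feature_params = dict(maxCorners=100, qualityLevel=0.3, minDistance=7, blockSize=7)
lk_params = dict(winSize=(15, 15), maxLevel=2,
                 criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 0.03))

ret, old_frame = cap.read()
old_gray = cv2.cvtColor(old_frame, cv2.COLOR_BGR2GRAY)
p0 = cv2.goodFeaturesToTrack(old_gray, mask=None, **feature_params)

while True:
    ret, frame = cap.read()
    if not ret:
        break  # end of video
    frame_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Calculate optical flow from the previous frame to the current one
    p1, st, err = cv2.calcOpticalFlowPyrLK(old_gray, frame_gray, p0, None, **lk_params)
    # Keep only the points that were tracked successfully
    good_new = p1[st == 1]
    # Roll the state forward for the next iteration
    old_gray = frame_gray.copy()
    p0 = good_new.reshape(-1, 1, 2)

cap.release()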
Finding and Matching Points
Image courtesy: OpenCV

import numpy as np
import cv2
…
# Initiate the ORB detector
orb = cv2.ORB_create()
# Find the keypoints and descriptors with ORB
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)
…
# Create a BFMatcher object (Hamming distance suits binary ORB descriptors)
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
# Match descriptors
matches = bf.match(des1, des2)
# Sort them in the order of their distance
matches = sorted(matches, key=lambda x: x.distance)
# Draw the first 10 matches
img3 = cv2.drawMatches(img1, kp1, img2, kp2, matches[:10], None, flags=2)

• OpenCV supports brute-force and FLANN-based (approximate nearest-neighbour) matching; a ratio-test sketch follows below
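When using knnMatch (brute-force or FLANN), Lowe's ratio test is the usual filter for ambiguous matches. A short sketch: good_matches is a hypothetical helper, and the 0.75 threshold is a common convention, not something prescribed by the slides:

import cv2

def good_matches(des1, des2, ratio=0.75):
    """Match binary descriptors and keep matches that pass Lowe's ratio test."""
    bf = cv2.BFMatcher(cv2.NORM_HAMMING)  # no crossCheck when using knnMatch
    knn = bf.knnMatch(des1, des2, k=2)
    good = []
    for pair in knn:
        # Keep a match only if it is clearly better than the runner-up
        if len(pair) == 2 and pair[0].distance < ratio * pair[1].distance:
            good.append(pair[0])
    return good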
Roadmap
Image courtesy: D. Scaramuzza
2 View Geometry
Image courtesy: Auckland University
• Goal: estimate the 3D scene structure, the camera poses (up to a scale factor) and the camera intrinsics
2 cases:
• Calibrated cameras: K matrices are known
• Uncalibrated cameras: K matrices are unknown
[Figure: two views related by rotation R and translation T]
2 View Geometry
Image courtesy: Auckland University
• Goal: estimate the 3D scene structure, the scale and the camera intrinsics
• Find R, T, P^i that satisfy the two-view projection constraints (reconstructed under "Calibrated Cameras" below)
Scale Ambiguity in Monocular Vision
Image courtesy: Amsterdam City Tours
• Rescaling the scene by a constant factor produces the same image
• We cannot recover the scale!
• Only 5 degrees of freedom: 3 for rotation, 2 for the direction of translation (but no scale)
Calibrated Cameras
• Use normalised image coordinates
• Find R, T, P^i that satisfy the projection constraints reconstructed below
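The constraints appeared as images in the deck. The left-camera equation is reconstructed from the editor's notes (slide #44); the right-camera analogue with $[\mathbf{R} \mid T]$ is the standard companion and is an assumption here. For each point $i$, in normalised coordinates with the world frame attached to the left camera:

$$\lambda_l \begin{bmatrix} \bar{u}^i_l \\ \bar{v}^i_l \\ 1 \end{bmatrix} = \left[\mathbf{I} \mid 0\right] \begin{bmatrix} X^i_w \\ Y^i_w \\ Z^i_w \\ 1 \end{bmatrix}, \qquad \lambda_r \begin{bmatrix} \bar{u}^i_r \\ \bar{v}^i_r \\ 1 \end{bmatrix} = \left[\mathbf{R} \mid T\right] \begin{bmatrix} X^i_w \\ Y^i_w \\ Z^i_w \\ 1 \end{bmatrix}.$$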
Essential Matrix
Slide courtesy: D. Scaramuzza
• $p_l$, $p_r$, $T$ form a plane (they are coplanar)
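This coplanarity is the epipolar constraint; in its standard form (up to sign and frame conventions), with $[T]_\times$ the skew-symmetric cross-product matrix and $p_l$, $p_r$ in normalised coordinates:

$$p_r^\top \mathbf{E}\, p_l = 0, \qquad \mathbf{E} = [T]_\times \mathbf{R}.$$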
Fundamental Matrix
Slide courtesy: D. Scaramuzza
• The same coplanarity constraint, expressed directly in pixel coordinates (per the editor's notes, use the fundamental matrix when only pixel coordinates are available)
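In standard form, with $\tilde{p}_l$, $\tilde{p}_r$ in pixel coordinates and $K_l$, $K_r$ the intrinsic matrices:

$$\tilde{p}_r^\top \mathbf{F}\, \tilde{p}_l = 0, \qquad \mathbf{F} = K_r^{-\top}\, \mathbf{E}\, K_l^{-1}.$$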
Recovering Pose
Slide courtesy: S. Weiss
• Recover R, T from the essential matrix
• The decomposition yields multiple candidate solutions; only one places the point in front of both cameras
• Apply motion constancy (a Kalman filter?)
RANSAC
Slide courtesy: S. Weiss
Assume:
• The model parameters can be estimated from N data items (e.g. the essential matrix from 5-8 point correspondences)
• There are M data items in total
The algorithm:
1. Select N data items at random
2. Estimate the model parameters (linear or nonlinear least squares, or other)
3. Count how many of the M data items fit the model within a user-given tolerance T. Call this K. If K is the largest (best fit) so far, accept the model.
Repeat steps 1-3, S times. (A sketch follows below.)
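A minimal generic sketch of this loop; the fit and error callbacks are placeholders, and the tolerance T and trial count S are the user-chosen values flagged as open questions in the editor's notes:

import random

def ransac(data, N, S, T, fit, error):
    """Generic RANSAC: fit models to random minimal samples and keep the
    model with the most inliers within tolerance T."""
    best_model, best_K = None, 0
    for _ in range(S):
        sample = random.sample(data, N)   # 1. select N data items at random
        model = fit(sample)               # 2. estimate the model parameters
        # 3. count data items that fit the model within tolerance T
        K = sum(1 for d in data if error(model, d) < T)
        if K > best_K:                    # best fit so far: accept it
            best_model, best_K = model, K
    return best_model, best_K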
Fundamental Matrix Song
https://www.youtube.com/watch?v=DgGV3l82NTk
Roadmap
Image courtesy: D. Scaramuzza
Putting It All Together
Image courtesy: A. Singh

import numpy as np
import cv2
…
while True:
    …
    # Calculate the essential matrix from matched/tracked points
    E, mask = cv2.findEssentialMat(points2, points1, focal, pp, cv2.RANSAC, 0.999, 1.0)
    # Recover R and t from the essential matrix
    _, R, t, mask = cv2.recoverPose(E, points2, points1, focal=focal, pp=pp, mask=mask)
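To turn per-frame (R, t) into a trajectory, monocular VO composes the relative motions. A minimal sketch consistent with the pipeline above; update_pose is a hypothetical helper, and the absolute scale must come from an external source (e.g. a speedometer or ground truth), since monocular VO cannot recover it:

import numpy as np

R_total = np.eye(3)          # accumulated rotation since the first frame
t_total = np.zeros((3, 1))   # accumulated position since the first frame

def update_pose(R, t, scale=1.0):
    """Compose a new relative motion (R, t) into the global pose."""
    global R_total, t_total
    t_total = t_total + scale * (R_total @ t)   # translate in the world frame
    R_total = R @ R_total                       # then update the orientation
    return R_total, t_total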
Putting It All Together
Image courtesy: A. Singh
https://www.youtube.com/watch?v=homos4vd_Zs
Resources
https://avisingh599.github.io/vision/monocular-vo/
http://frc.ri.cmu.edu/~kaess/vslam_cvpr14/media/VSLAM-Tutorial-CVPR14-A11-VisualOdometry.pdf
Davide Scaramuzza's home page: http://rpg.ifi.uzh.ch
Editor's Notes

  • #28: The world-to-camera transform (the left-hand side, the camera-frame point, was an image):

$$\begin{bmatrix} r_{11} & r_{12} & r_{13} & t_1 \\ r_{21} & r_{22} & r_{23} & t_2 \\ r_{31} & r_{32} & r_{33} & t_3 \end{bmatrix} \begin{bmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{bmatrix} = \left[\mathbf{R} \mid T\right] \begin{bmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{bmatrix}$$
  • #44: The left-camera projection constraint:

$$\lambda_l \begin{bmatrix} \bar{u}^i_l \\ \bar{v}^i_l \\ 1 \end{bmatrix} = \left[\mathbf{I} \mid 0\right] \begin{bmatrix} X^i_w \\ Y^i_w \\ Z^i_w \\ 1 \end{bmatrix}$$
  • #46: If we only have pixel coordinates for matches, then use the fundamental matrix. Use the essential matrix if we have normalised image coordinates (i.e. the projection before the intrinsics are applied).
  • #49: Questions: What is the tolerance T? How many trials S ensure success?