Robert Collins
CSE486, Penn State




                          Lecture 25:
                     Structure from Motion
                         Structure from Motion
     Given a set of flow fields or displacement vectors from
     a moving camera over time, determine:
        • the sequence of camera poses
        • the 3D structure of the scene




     [Figure: a sequence of camera poses viewing the scene structure]
                      SFM “Killer App”

    Match Move
       Track a set of feature points through a movie sequence

       Deduce where the cameras are and the 3D
         locations of the points that were tracked

       Render synthetic objects with respect to the
         deduced 3D geometry of the scene / cameras
                        Match Move Examples

           [Images: “Hart’s War” and “Graham Kimpton” examples from www.realviz.com
           MatchMover Professional gallery. Copyrighted.]
                             Factorization
        Tomasi and Kanade, “Shape and Motion from Image
        Streams under Orthography,” International Journal
        of Computer Vision (IJCV), Vol 9, pp.137-154, 1992.

        Goal: combine point correspondence information from
        multiple points over multiple frames to solve for scene
        structure and camera motion (structure from motion)

        Approach: a numerically stable method that uses the SVD
        to “factor” the matrix of observed point positions.

        Historical significance: until then, most SFM work dealt
        with minimal configurations and noise-free data.
        Factorization was one of the first practical SFM algorithms.
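        Recall: World to Camera Transform

        \[ P^C = R \, (P^W - C) \]

        \[
        \begin{bmatrix} P^C_x \\ P^C_y \\ P^C_z \\ 1 \end{bmatrix}
        =
        \begin{bmatrix}
          r_{11} & r_{12} & r_{13} & 0 \\
          r_{21} & r_{22} & r_{23} & 0 \\
          r_{31} & r_{32} & r_{33} & 0 \\
          0 & 0 & 0 & 1
        \end{bmatrix}
        \begin{bmatrix}
          1 & 0 & 0 & -c_x \\
          0 & 1 & 0 & -c_y \\
          0 & 0 & 1 & -c_z \\
          0 & 0 & 0 & 1
        \end{bmatrix}
        \begin{bmatrix} P^W_x \\ P^W_y \\ P^W_z \\ 1 \end{bmatrix}
        \]

        \[ P^C = M_{ext} \, P^W \]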
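As a quick numerical check of the transform above, the following minimal sketch builds M_ext as the product of the 4x4 rotation and translation matrices and compares it with the direct formula R (P^W - C). The helper names `rotation_z` and `extrinsic_matrix`, the choice of a rotation about the z axis, and the sample numbers are illustrative assumptions, not part of the lecture.

```python
import numpy as np

def rotation_z(theta):
    """Rotation by theta radians about the z axis (an arbitrary example of R)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

def extrinsic_matrix(R, C):
    """Build M_ext = [R 0; 0 1] @ [I -C; 0 1] as a 4x4 matrix."""
    rot = np.eye(4)
    rot[:3, :3] = R
    trans = np.eye(4)
    trans[:3, 3] = -np.asarray(C, dtype=float)
    return rot @ trans

R = rotation_z(np.pi / 6)
C = np.array([1.0, 2.0, 3.0])      # camera center in world coordinates
P_w = np.array([4.0, 5.0, 6.0])    # a point in world coordinates

# Direct formula versus the homogeneous matrix product.
P_c_direct = R @ (P_w - C)
P_c_homog = extrinsic_matrix(R, C) @ np.append(P_w, 1.0)

print(P_c_direct)
print(P_c_homog[:3])
```

The two printed vectors should agree, and the camera center itself maps to the origin of the camera frame.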
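                     Perspective Projection

        [Figure: pinhole camera with center of projection O, axes X, Y, Z,
        and focal length f; scene point P projects to image point p = (x, y)]

        \[ x = f \, \frac{X}{Z}, \qquad y = f \, \frac{Y}{Z} \]

                     • Non-linear equations
                     • Any point on the ray OP has image p!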
                      Perspective Projection

        \[ x = f \, \frac{X}{Z}, \qquad y = f \, \frac{Y}{Z} \]

          Perspective projection: parallel lines appear to meet
          at a vanishing point; farther objects seem smaller.

          (O. Camps, PSU)
                Simplification: Weak Perspective

        \[ x = \frac{f}{Z_0} \, X, \qquad y = \frac{f}{Z_0} \, Y \]

          Weak perspective = parallel projection (parallel
          lines remain parallel) + scaling to simulate the change
          in size due to object distance.
               Simpler: Orthographic Projection

        \[ x = X, \qquad y = Y \]

          Pure parallel projection. Highly simplified case
          where we even ignore the scaling due to distance.
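To make the difference between the three camera models concrete, here is a small sketch that projects the same camera-frame points with each of them. The function names, the sample points, and the choice of Z_0 = 10 as the reference depth are assumptions made for illustration.

```python
import numpy as np

def perspective(P, f):
    """x = f X / Z, y = f Y / Z for points P given as rows (X, Y, Z)."""
    X, Y, Z = P.T
    return np.stack([f * X / Z, f * Y / Z], axis=1)

def weak_perspective(P, f, Z0):
    """x = (f / Z0) X, y = (f / Z0) Y with one reference depth Z0 for all points."""
    return (f / Z0) * P[:, :2]

def orthographic(P):
    """x = X, y = Y: depth is ignored entirely."""
    return P[:, :2].copy()

# Two points with the same (X, Y) at different depths, in camera coordinates.
P = np.array([[1.0, 2.0, 10.0],
              [1.0, 2.0, 20.0]])
f, Z0 = 2.0, 10.0   # illustrative focal length and reference depth

print(perspective(P, f))           # the farther point lands closer to the image center
print(weak_perspective(P, f, Z0))  # both points get the same scale f / Z0
print(orthographic(P))             # both points map to (X, Y)
```

Under perspective the two points project to different image locations, while under weak perspective and orthography they coincide, which is exactly the depth information those simpler models discard.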
                     Perspective Matrix Equation
                        (Camera Coordinates)

        \[ x = f \, \frac{X}{Z}, \qquad y = f \, \frac{Y}{Z} \]

        Using homogeneous coordinates:

        \[
        \begin{bmatrix} x' \\ y' \\ z' \end{bmatrix}
        =
        \begin{bmatrix}
          f & 0 & 0 & 0 \\
          0 & f & 0 & 0 \\
          0 & 0 & 1 & 0
        \end{bmatrix}
        \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix},
        \qquad x = \frac{x'}{z'}, \quad y = \frac{y'}{z'}
        \]
             Weak Perspective Approximation

        \[ x = \frac{f}{Z_0} \, X, \qquad y = \frac{f}{Z_0} \, Y \]

        Using homogeneous coordinates:

        \[
        \begin{bmatrix} x' \\ y' \\ z' \end{bmatrix}
        =
        \begin{bmatrix}
          f & 0 & 0 & 0 \\
          0 & f & 0 & 0 \\
          0 & 0 & 0 & Z_0
        \end{bmatrix}
        \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix},
        \qquad x = \frac{x'}{z'}, \quad y = \frac{y'}{z'}
        \]
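                     Let’s Consider Orthographic

        \[ x = X, \qquad y = Y \]

        Using homogeneous coordinates (the weak perspective matrix with f = 1 and Z_0 = 1):

        \[
        \begin{bmatrix} x' \\ y' \\ z' \end{bmatrix}
        =
        \begin{bmatrix}
          1 & 0 & 0 & 0 \\
          0 & 1 & 0 & 0 \\
          0 & 0 & 0 & 1
        \end{bmatrix}
        \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix},
        \qquad x = \frac{x'}{z'}, \quad y = \frac{y'}{z'}
        \]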
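The three homogeneous matrices can be compared side by side. The sketch below, with hypothetical helper names `projection_matrices` and `project` and arbitrary sample values, applies each 3x4 matrix to the same point and then divides by z'.

```python
import numpy as np

def projection_matrices(f, Z0):
    """Return the 3x4 perspective, weak perspective, and orthographic matrices."""
    persp = np.array([[f, 0, 0, 0],
                      [0, f, 0, 0],
                      [0, 0, 1, 0]], dtype=float)
    weak = np.array([[f, 0, 0, 0],
                     [0, f, 0, 0],
                     [0, 0, 0, Z0]], dtype=float)
    ortho = np.array([[1, 0, 0, 0],
                      [0, 1, 0, 0],
                      [0, 0, 0, 1]], dtype=float)
    return persp, weak, ortho

def project(M, P):
    """Apply a 3x4 projection to a 3D point and divide by z'."""
    xh = M @ np.append(P, 1.0)
    return xh[:2] / xh[2]

P = np.array([2.0, 4.0, 8.0])                      # point in camera coordinates
persp, weak, ortho = projection_matrices(f=2.0, Z0=4.0)

for name, M in [("perspective", persp), ("weak", weak), ("orthographic", ortho)]:
    print(name, project(M, P))
```

Only the perspective matrix lets z' depend on the point's own depth Z; the other two put a constant in z', which is why their image coordinates are linear in X and Y.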
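                     Combine with External Params

        \[
        \begin{bmatrix} x \\ y \end{bmatrix}
        =
        \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix}
        \begin{bmatrix}
          r_{11} & r_{12} & r_{13} & 0 \\
          r_{21} & r_{22} & r_{23} & 0 \\
          r_{31} & r_{32} & r_{33} & 0 \\
          0 & 0 & 0 & 1
        \end{bmatrix}
        \begin{bmatrix}
          1 & 0 & 0 & -c_x \\
          0 & 1 & 0 & -c_y \\
          0 & 0 & 1 & -c_z \\
          0 & 0 & 0 & 1
        \end{bmatrix}
        \begin{bmatrix} P^W_x \\ P^W_y \\ P^W_z \\ 1 \end{bmatrix}
        \]

        \[
        \begin{bmatrix} x \\ y \end{bmatrix}
        =
        \begin{bmatrix}
          r_{11} & r_{12} & r_{13} & 0 \\
          r_{21} & r_{22} & r_{23} & 0
        \end{bmatrix}
        \begin{bmatrix}
          1 & 0 & 0 & -c_x \\
          0 & 1 & 0 & -c_y \\
          0 & 0 & 1 & -c_z \\
          0 & 0 & 0 & 1
        \end{bmatrix}
        \begin{bmatrix} P^W_x \\ P^W_y \\ P^W_z \\ 1 \end{bmatrix}
        \]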
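                     Combine with External Params

        \[
        \begin{bmatrix} x \\ y \end{bmatrix}
        =
        \begin{bmatrix}
          r_{11} & r_{12} & r_{13} & 0 \\
          r_{21} & r_{22} & r_{23} & 0
        \end{bmatrix}
        \begin{bmatrix}
          1 & 0 & 0 & -c_x \\
          0 & 1 & 0 & -c_y \\
          0 & 0 & 1 & -c_z \\
          0 & 0 & 0 & 1
        \end{bmatrix}
        \begin{bmatrix} P^W_x \\ P^W_y \\ P^W_z \\ 1 \end{bmatrix}
        \]

        \[
        \begin{bmatrix} x \\ y \end{bmatrix}
        =
        \begin{bmatrix}
          r_{11} & r_{12} & r_{13} \\
          r_{21} & r_{22} & r_{23}
        \end{bmatrix}
        \begin{bmatrix} P^W_x - c_x \\ P^W_y - c_y \\ P^W_z - c_z \end{bmatrix}
        \]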
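               Orthographic: Algebraic Equation

        \[
        \begin{bmatrix} x \\ y \end{bmatrix}
        =
        \begin{bmatrix}
          r_{11} & r_{12} & r_{13} \\
          r_{21} & r_{22} & r_{23}
        \end{bmatrix}
        \begin{bmatrix} P^W_x - c_x \\ P^W_y - c_y \\ P^W_z - c_z \end{bmatrix}
        \]

        Label the first row of the rotation \(\mathbf{i}^T\), the second row \(\mathbf{j}^T\),
        the world point P, and the camera position T = (c_x, c_y, c_z)^T:

        \[ x = \mathbf{i}^T (P - T) \]
        \[ y = \mathbf{j}^T (P - T) \]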
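The collapse from the chain of homogeneous matrices to the two dot products can be checked numerically. This is a minimal sketch; the function names, the rotation about the y axis, and the sample point are illustrative assumptions.

```python
import numpy as np

def rotation_y(theta):
    """Rotation by theta radians about the y axis (an arbitrary example of R)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, 0.0, s],
                     [0.0, 1.0, 0.0],
                     [-s, 0.0, c]])

def orthographic_image(R, T, P):
    """x = i^T (P - T), y = j^T (P - T), with i^T and j^T the first two rows of R."""
    i_vec, j_vec = R[0], R[1]
    d = P - T
    return np.array([i_vec @ d, j_vec @ d])

def orthographic_image_chain(R, T, P):
    """Same image point via the 2x4 projection, 4x4 rotation, and 4x4 translation."""
    proj = np.array([[1.0, 0.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0, 0.0]])
    rot = np.eye(4)
    rot[:3, :3] = R
    trans = np.eye(4)
    trans[:3, 3] = -T
    return proj @ rot @ trans @ np.append(P, 1.0)

R = rotation_y(0.4)
T = np.array([0.5, -1.0, 2.0])     # camera position
P = np.array([3.0, 1.0, 7.0])      # world point

xy = orthographic_image(R, T, P)
xy_chain = orthographic_image_chain(R, T, P)
# Moving the point along the viewing direction (third row of R) leaves the image unchanged.
xy_shifted = orthographic_image(R, T, P + 5.0 * R[2])
print(xy, xy_chain, xy_shifted)
```

All three printed image points should match: the third row of R never enters the orthographic equations, so depth along the viewing direction is unobservable.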
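                    Multiple Points, Multiple Frames
                       Notation (attack of the killer subscripts)

        Single point, single frame:

        \[ x = \mathbf{i}^T (P - T), \qquad y = \mathbf{j}^T (P - T) \]

        N points:  P_1, P_2, …, P_j, …, P_N

        F frames:  \(\mathbf{i}_1, \mathbf{i}_2, …, \mathbf{i}_i, …, \mathbf{i}_F\)
                   \(\mathbf{j}_1, \mathbf{j}_2, …, \mathbf{j}_i, …, \mathbf{j}_F\)
                   T_1, T_2, …, T_i, …, T_F

        Point j observed in frame i:

        \[ x_{ij} = \mathbf{i}_i^T (P_j - T_i) \]
        \[ y_{ij} = \mathbf{j}_i^T (P_j - T_i) \]

                                                          (Eq 8.31-8.32, T&V book)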
                       Factorization Approach

        \[ x_{ij} = \mathbf{i}_i^T (P_j - T_i), \qquad y_{ij} = \mathbf{j}_i^T (P_j - T_i) \]

        N points P_1, P_2, …, P_j, …, P_N (we want to recover these)

         Note that the absolute position of the set of points
         cannot be uniquely recovered, so…

         First Trick: set the origin of the world coordinate
         system to be the center of mass of the N points!

         \[ \frac{1}{N} \sum_{j=1}^{N} P_j = 0 \]
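                           Factorization Approach

        \[ x_{ij} = \mathbf{i}_i^T (P_j - T_i), \qquad y_{ij} = \mathbf{j}_i^T (P_j - T_i) \]

        Centroid at 0:

        \[ \frac{1}{N} \sum_{j=1}^{N} P_j = 0 \]

    Implication:

        \[
        \frac{1}{N} \sum_{j=1}^{N} x_{ij}
        = \frac{1}{N} \sum_{j=1}^{N} \mathbf{i}_i^T (P_j - T_i)
        = \mathbf{i}_i^T \Big( \frac{1}{N} \sum_{j=1}^{N} P_j \Big) - \frac{1}{N} \sum_{j=1}^{N} \mathbf{i}_i^T T_i
        = -\,\mathbf{i}_i^T T_i
        \]

    Note: the left-hand side is the center of mass of the x coordinates in frame i.
    The same argument with \(\mathbf{j}_i\) gives \(-\mathbf{j}_i^T T_i\) for the y coordinates.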
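                     Factorization Approach

          Second Trick: subtract off the center of mass of the
          2D points in each frame. (Centering)

          \[ x_{ij} = \mathbf{i}_i^T (P_j - T_i), \qquad y_{ij} = \mathbf{j}_i^T (P_j - T_i) \]

          \[
          \tilde{x}_{ij} = x_{ij} - \frac{1}{N} \sum_{k=1}^{N} x_{ik}, \qquad
          \tilde{y}_{ij} = y_{ij} - \frac{1}{N} \sum_{k=1}^{N} y_{ik}
          \]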
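                             Factorization Approach

          Before centering:

          \[ x_{ij} = \mathbf{i}_i^T (P_j - T_i), \qquad y_{ij} = \mathbf{j}_i^T (P_j - T_i) \]

          After centering (with the 3D centroid at the origin, the mean of the
          x coordinates in frame i is \(-\mathbf{i}_i^T T_i\), so subtracting it cancels the T_i term):

          \[ \tilde{x}_{ij} = \mathbf{i}_i^T P_j, \qquad \tilde{y}_{ij} = \mathbf{j}_i^T P_j \]

       What have we accomplished so far?

    1) Removed the unknown camera locations from the equations.

    2) More importantly, we can now write everything
        as a big matrix equation…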
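The effect of the two tricks can be verified on synthetic data. The sketch below is an illustration under assumed inputs: random rotations, random camera positions, and random points are made up for the test, and `random_rotation` is a hypothetical helper.

```python
import numpy as np

def random_rotation(rng):
    """Random 3x3 rotation from the QR decomposition of a Gaussian matrix."""
    Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(Q) < 0:
        Q[:, 0] *= -1.0          # flip one axis so that det(Q) = +1
    return Q

rng = np.random.default_rng(0)
F, N = 4, 6                      # frames and points (arbitrary sizes)

P = rng.normal(size=(3, N))
P -= P.mean(axis=1, keepdims=True)          # first trick: 3D centroid at the origin
R = [random_rotation(rng) for _ in range(F)]
T = rng.normal(size=(F, 3))                 # unknown camera positions

# x[i, j] = i_i^T (P_j - T_i), y[i, j] = j_i^T (P_j - T_i)
x = np.array([R[i][0] @ (P - T[i][:, None]) for i in range(F)])
y = np.array([R[i][1] @ (P - T[i][:, None]) for i in range(F)])

# Second trick: subtract the per-frame mean of the image coordinates.
x_tilde = x - x.mean(axis=1, keepdims=True)
y_tilde = y - y.mean(axis=1, keepdims=True)

# Prediction from the slides: the camera positions drop out.
x_expected = np.array([R[i][0] @ P for i in range(F)])
y_expected = np.array([R[i][1] @ P for i in range(F)])
print(np.abs(x_tilde - x_expected).max(), np.abs(y_tilde - y_expected).max())
```

Both printed differences should be at the level of floating-point round-off, confirming that the centered coordinates depend only on the rotation rows and the points.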
                     Factorization Approach

   Form a matrix of centered image points (2F x N):

   \[
   \begin{bmatrix}
     \tilde{x}_{11} & \tilde{x}_{12} & \tilde{x}_{13} & \cdots & \tilde{x}_{1N} \\
     \vdots & & & & \vdots \\
     \tilde{x}_{F1} & \tilde{x}_{F2} & \tilde{x}_{F3} & \cdots & \tilde{x}_{FN} \\
     \tilde{y}_{11} & \tilde{y}_{12} & \tilde{y}_{13} & \cdots & \tilde{y}_{1N} \\
     \vdots & & & & \vdots \\
     \tilde{y}_{F1} & \tilde{y}_{F2} & \tilde{y}_{F3} & \cdots & \tilde{y}_{FN}
   \end{bmatrix}
   \]

   • Rows i and F+i together hold all N points in one frame (frame i).
   • Each column tracks one point through all F frames.
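         Factorization Approach

   \[ \tilde{x}_{ij} = \mathbf{i}_i^T P_j, \qquad \tilde{y}_{ij} = \mathbf{j}_i^T P_j \]

   Matrix of centered image points:

   \[
   \underbrace{
   \begin{bmatrix}
     \tilde{x}_{11} & \tilde{x}_{12} & \tilde{x}_{13} & \cdots & \tilde{x}_{1N} \\
     \vdots & & & & \vdots \\
     \tilde{x}_{F1} & \tilde{x}_{F2} & \tilde{x}_{F3} & \cdots & \tilde{x}_{FN} \\
     \tilde{y}_{11} & \tilde{y}_{12} & \tilde{y}_{13} & \cdots & \tilde{y}_{1N} \\
     \vdots & & & & \vdots \\
     \tilde{y}_{F1} & \tilde{y}_{F2} & \tilde{y}_{F3} & \cdots & \tilde{y}_{FN}
   \end{bmatrix}
   }_{2F \times N}
   =
   \underbrace{
   \begin{bmatrix}
     \mathbf{i}_1^T \\ \vdots \\ \mathbf{i}_F^T \\ \mathbf{j}_1^T \\ \vdots \\ \mathbf{j}_F^T
   \end{bmatrix}
   }_{2F \times 3}
   \underbrace{
   \begin{bmatrix} P_1 & P_2 & \cdots & P_N \end{bmatrix}
   }_{3 \times N}
   \]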
                       Factorization Approach

        \[ \underbrace{W}_{2F \times N} = \underbrace{M}_{2F \times 3} \; \underbrace{S}_{3 \times N} \]

        W: centered measurement matrix
        M: “motion” (camera rotation)
        S: structure (3D scene points)
                       Factorization Approach

        \[ \underbrace{W}_{2F \times N} = \underbrace{M}_{2F \times 3} \; \underbrace{S}_{3 \times N} \]

       Rank Theorem:
           The 2FxN centered observation matrix has
           at most rank 3.
       Proof:
          Trivial, using the properties:
          • rank of an mxn matrix is at most min(m,n)
          • rank of A*B is at most min(rank(A),rank(B))
          M is 2Fx3 and S is 3xN, so each has rank at most 3,
          and therefore so does W = M S.
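The proof depends only on the shapes of the two factors, which the following sketch illustrates with random matrices standing in for M and S (the sizes and the random entries are assumptions for the demonstration, not real motion or structure).

```python
import numpy as np

rng = np.random.default_rng(0)
F, N = 5, 8                          # arbitrary numbers of frames and points

M = rng.normal(size=(2 * F, 3))      # stand-in for the 2F x 3 motion matrix
S = rng.normal(size=(3, N))          # stand-in for the 3 x N structure matrix
W = M @ S                            # 2F x N, built as a product through 3 dimensions

rank_W = np.linalg.matrix_rank(W)
print(W.shape, rank_W)
```

Although W has 10 rows and 8 columns here, its rank cannot exceed 3 because every column of W is a combination of the 3 columns of M.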
                             Rank of a Matrix

      What is the rank of a matrix, anyway?

         The number of linearly independent columns (equivalently, rows).

         If an MxN matrix A is treated as a linear map from
         N-dimensional space to M-dimensional space, the rank is the
         dimension of the subspace that A maps into (its image).

         [Figure: an MxN matrix A mapping between an N-dimensional space
         and an M-dimensional space; the matrix illustrated would have rank 1]
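A rank 1 example makes the "dimension of the image" reading concrete. In this small sketch the matrix and the test vector are made-up values: every column of A is a multiple of (1, 2, 3), so A sends every input onto that one line.

```python
import numpy as np

# 3x2 matrix whose columns are (1, 2, 3) and -(1, 2, 3).
A = np.outer([1.0, 2.0, 3.0], [1.0, -1.0])
rank_A = np.linalg.matrix_rank(A)

v = np.array([0.3, -0.7])            # an arbitrary input vector
image = A @ v                        # lands on the line spanned by (1, 2, 3)

print(rank_A)
print(image, np.cross(image, [1.0, 2.0, 3.0]))
```

The cross product of the output with (1, 2, 3) is the zero vector, so the output is parallel to that direction regardless of the input.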
                     Factorization Rank Theorem

        Importance of rank theorem:

          •Shows that video data is highly redundant

          •Precisely quantifies the redundancy

          •Suggests an algorithm for solving SFM
                     Factorization Approach

       Form the SVD of the measurement matrix W:

       \[ \underbrace{W}_{2F \times N} = \underbrace{U}_{2F \times 2F} \; \underbrace{D}_{2F \times N} \; \underbrace{V^T}_{N \times N} \]

       D is a diagonal matrix of singular values,
       sorted in decreasing order:
           d11 >= d22 >= d33 >= … >= 0

       Another useful rank property:
          The rank of a matrix is equal to the number of
          nonzero singular values.
          By the rank theorem, d11, d22, d33 are the only singular
          values of W that can be nonzero (the rest are 0).
                         Factorization Approach

       [Diagram: W (2FxN) = U (2Fx2F) * D (2FxN) * V^T (NxN), with the
       singular values in decreasing order along the diagonal of D]

   Rank theorem says: the first 3 singular values are nonzero and the rest should be zero.

       In practice, due to noise, there may be more than
       3 nonzero singular values, but the rank theorem tells us
       to ignore all but the largest three.
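                         Factorization Approach

       [Diagram: the blocks of U (2Fx2F), D (2FxN), and V^T (NxN) that belong to
       the three largest singular values are kept as U’ (2Fx3), D’ (3x3), and V’^T (3xN)]

       \[ \underbrace{W}_{2F \times N} = \underbrace{U'}_{2F \times 3} \; \underbrace{D'}_{3 \times 3} \; \underbrace{V'^T}_{3 \times N} \]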
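The truncation step can be sketched with NumPy's SVD on a synthetic noisy matrix. The sizes, noise level, and random stand-ins for the factors are assumptions; note also that `np.linalg.svd` with `full_matrices=False` returns the economy-size factors and a vector of singular values rather than the full 2Fx2F, 2FxN, and NxN matrices written on the slides.

```python
import numpy as np

rng = np.random.default_rng(1)
F, N = 5, 12

# Rank 3 "clean" measurements plus a small amount of noise (both synthetic).
W_clean = rng.normal(size=(2 * F, 3)) @ rng.normal(size=(3, N))
W_noisy = W_clean + 1e-3 * rng.normal(size=W_clean.shape)

U, d, Vt = np.linalg.svd(W_noisy, full_matrices=False)   # d is sorted, largest first
print(d)                         # three large values followed by small noise-level ones

# Keep only the parts that belong to the three largest singular values.
U3 = U[:, :3]                    # 2F x 3
D3 = np.diag(d[:3])              # 3 x 3
Vt3 = Vt[:3, :]                  # 3 x N
W_rank3 = U3 @ D3 @ Vt3

# The spectral-norm error of the truncation equals the first discarded singular value.
truncation_error = np.linalg.norm(W_noisy - W_rank3, 2)
print(truncation_error, d[3])
```

With noise the measured matrix has full rank, yet the fourth and later singular values are tiny compared with the first three, which is what justifies discarding them.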
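                        Factorization Approach

        Observed image points, factored by SVD:

        \[ W = U' \, D' \, V'^T \]

        Split D’ into two square roots:

        \[ \underbrace{W}_{2F \times N} = \underbrace{\big( U' \, D'^{1/2} \big)}_{2F \times 3} \; \underbrace{\big( D'^{1/2} \, V'^T \big)}_{3 \times N} \]

        \[ W = M S \]

        M: camera motion
        S: scene structure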
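                        Annoying Details

        \[ \underbrace{W}_{2F \times N} = \underbrace{\big( U' \, D'^{1/2} \big)}_{2F \times 3} \; \underbrace{\big( D'^{1/2} \, V'^T \big)}_{3 \times N} = M S \]

   Problems:
     1) This is not a unique decomposition. For any invertible 3x3 matrix Q,
          (M Q) (Q^{-1} S) = M Q Q^{-1} S = M S
     2) The i^T, j^T pairs (rows of M) are not necessarily unit length
        or mutually orthogonal.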
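The non-uniqueness is easy to see numerically. In this sketch M and S are random stand-ins and Q is an arbitrary invertible matrix chosen for illustration (its determinant is 6).

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.normal(size=(8, 3))          # stand-in for a 2F x 3 motion factor (F = 4)
S = rng.normal(size=(3, 5))          # stand-in for a 3 x N structure factor (N = 5)

# Any invertible 3x3 matrix works; this one is upper triangular with det = 2 * 1 * 3.
Q = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 3.0]])

M_alt = M @ Q
S_alt = np.linalg.inv(Q) @ S

print(np.allclose(M_alt @ S_alt, M @ S))   # same measurements
print(np.allclose(M_alt, M))               # different motion factor
```

The measurements alone cannot distinguish (M, S) from (M Q, Q^{-1} S), so extra constraints on the rows of M are needed to pin down Q.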
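                     Solving the Annoying Details
    Solution to both problems:

      Solve for Q such that the appropriate rows of M Q satisfy

        \[ \hat{\mathbf{i}}_i^T Q Q^T \hat{\mathbf{i}}_i = 1, \qquad \hat{\mathbf{j}}_i^T Q Q^T \hat{\mathbf{j}}_i = 1 \qquad \text{(unit vectors)} \]

        \[ \hat{\mathbf{i}}_i^T Q Q^T \hat{\mathbf{j}}_i = 0 \qquad \text{(orthogonal)} \]

      where \(\hat{\mathbf{i}}_i^T\) and \(\hat{\mathbf{j}}_i^T\) are rows i and F+i of the matrix M obtained from the SVD.

      3F equations (3 per frame) in 9 unknowns
      But these are nonlinear equations in the entries of Q
          linearize and iterate
         (see Exercise 8.8 in book for Newton’s method)
            (an alternative approach is to use Cholesky decomposition; outside our scope)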
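For readers curious about the Cholesky alternative mentioned above, here is a brief sketch of the idea, stated as an outline rather than as the method used in the book's exercise. The constraints are quadratic in Q but linear in the symmetric matrix L = Q Q^T, which has only 6 distinct entries. Writing \(\hat{\mathbf{i}}_i^T\) and \(\hat{\mathbf{j}}_i^T\) for rows i and F+i of M (notation assumed here), one can solve a linear least-squares problem for L and then factor it.

```latex
\begin{aligned}
L &= Q Q^T =
\begin{bmatrix}
  l_{11} & l_{12} & l_{13} \\
  l_{12} & l_{22} & l_{23} \\
  l_{13} & l_{23} & l_{33}
\end{bmatrix}
\qquad \text{(symmetric, 6 unknowns)} \\[4pt]
\hat{\mathbf{i}}_i^T L \, \hat{\mathbf{i}}_i &= 1, \qquad
\hat{\mathbf{j}}_i^T L \, \hat{\mathbf{j}}_i = 1, \qquad
\hat{\mathbf{i}}_i^T L \, \hat{\mathbf{j}}_i = 0, \qquad i = 1, \dots, F \\[4pt]
\mathbf{a}^T L \, \mathbf{b} &=
a_1 b_1 \, l_{11} + (a_1 b_2 + a_2 b_1) \, l_{12} + (a_1 b_3 + a_3 b_1) \, l_{13}
+ a_2 b_2 \, l_{22} + (a_2 b_3 + a_3 b_2) \, l_{23} + a_3 b_3 \, l_{33}
\end{aligned}
```

Each constraint is therefore one linear equation in (l_11, l_12, l_13, l_22, l_23, l_33), giving 3F linear equations in 6 unknowns. If the estimated L is positive definite, a Cholesky factorization L = Q Q^T yields one valid Q; any other solution differs from it by a 3x3 orthogonal matrix, which corresponds to the arbitrary choice of world orientation.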
                     Factorization Summary
       Assumptions
        - orthographic camera
        - N non-coplanar points tracked in F >= 3 frames

        Form the centered measurement matrix \( W = [\tilde{X} ; \tilde{Y}] \)
         - where \( \tilde{x}_{ij} = x_{ij} - m_{x_i} \)
         - where \( \tilde{y}_{ij} = y_{ij} - m_{y_i} \)
         - \( m_{x_i} \) and \( m_{y_i} \) are the means of the x and y coordinates of the points in frame i
         - j ranges over the set of points

        Rank theorem: The centered measurement matrix
        has a rank of at most 3
                     Factorization Algorithm
   1) Form the centered measurement matrix W from N points
         tracked over F frames.
   2) Compute the SVD of W = U D V^T
       - U is 2Fx2F
       - D is 2FxN
       - V^T is NxN
   3) Take the 3 largest singular values, and form
       - D’ = 3x3 diagonal matrix of the largest singular values
       - U’ = 2Fx3 matrix of the corresponding column vectors from U
       - V’^T = 3xN matrix of the corresponding row vectors from V^T
   4) Define
        M = U’ D’^{1/2} and S = D’^{1/2} V’^T
   5) Solve for Q that makes the appropriate rows of M Q orthonormal
   6) Final solution is
        M* = M Q and S* = Q^{-1} S
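The six steps can be put together in a short NumPy sketch. Several things here are assumptions rather than the lecture's prescription: the function names `factorize` and `sym_coeffs` are hypothetical, step 5 is done with the linear least-squares plus Cholesky route (the alternative the slides mention) instead of Newton iteration, and the test data are synthetic and noise-free.

```python
import numpy as np

def sym_coeffs(a, b):
    """Coefficients of (l11, l12, l13, l22, l23, l33) in a^T L b for symmetric L."""
    return [a[0] * b[0],
            a[0] * b[1] + a[1] * b[0],
            a[0] * b[2] + a[2] * b[0],
            a[1] * b[1],
            a[1] * b[2] + a[2] * b[1],
            a[2] * b[2]]

def factorize(x, y):
    """x, y: F x N arrays of tracked image coordinates. Returns (M*, S*, W)."""
    F = x.shape[0]
    # Step 1: centered measurement matrix (2F x N).
    W = np.vstack([x - x.mean(axis=1, keepdims=True),
                   y - y.mean(axis=1, keepdims=True)])
    # Steps 2-4: SVD, keep the three largest singular values, split D'.
    U, d, Vt = np.linalg.svd(W, full_matrices=False)
    root_D = np.diag(np.sqrt(d[:3]))
    M_hat = U[:, :3] @ root_D
    S_hat = root_D @ Vt[:3, :]
    # Step 5: least squares for L = Q Q^T from the 3F metric constraints.
    I_hat, J_hat = M_hat[:F], M_hat[F:]
    rows = ([sym_coeffs(i, i) for i in I_hat]
            + [sym_coeffs(j, j) for j in J_hat]
            + [sym_coeffs(i, j) for i, j in zip(I_hat, J_hat)])
    rhs = np.concatenate([np.ones(2 * F), np.zeros(F)])
    l = np.linalg.lstsq(np.array(rows), rhs, rcond=None)[0]
    L = np.array([[l[0], l[1], l[2]],
                  [l[1], l[3], l[4]],
                  [l[2], l[4], l[5]]])
    Q = np.linalg.cholesky(L)        # lower triangular, L = Q Q^T; needs L positive definite
    # Step 6: corrected motion and structure.
    return M_hat @ Q, np.linalg.inv(Q) @ S_hat, W

def random_rotation(rng):
    """Random 3x3 rotation from the QR decomposition of a Gaussian matrix."""
    Qm, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(Qm) < 0:
        Qm[:, 0] *= -1.0
    return Qm

# Synthetic orthographic data: F frames of N points with unknown camera positions.
rng = np.random.default_rng(0)
F, N = 6, 10
P = rng.normal(size=(3, N))
P -= P.mean(axis=1, keepdims=True)
rotations = [random_rotation(rng) for _ in range(F)]
T = rng.normal(size=(F, 3))
x = np.array([Rm[0] @ (P - t[:, None]) for Rm, t in zip(rotations, T)])
y = np.array([Rm[1] @ (P - t[:, None]) for Rm, t in zip(rotations, T)])

M, S, W = factorize(x, y)
print(np.linalg.norm(M[:F], axis=1))       # lengths of the recovered i vectors
print(np.sum(M[:F] * M[F:], axis=1))       # dot products i_i . j_i
```

On this noise-free input the recovered row pairs should be orthonormal and M S should reproduce W. The structure S matches the true points only up to a global rotation or reflection, since the world orientation is not determined by the measurements; with noisy tracks the least-squares L may fail to be positive definite, in which case the Cholesky step raises an error and a different strategy is needed.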
                     Sample Results

        [Two embedded media clips of sample results; not available in this text version]
