Inverse Compositional Alignment of a sphere under orthogonal
projection for ball spin estimation

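             Toru Tamaki (Hiroshima University)
      Yukihiko Ushiyama (Faculty of Education and Human Sciences, Niigata University)

    Bisser Raytchev (Hiroshima University), Kazufumi Kaneda (Hiroshima University)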
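Research background
Video processing technology
●  Biomechanics
●  Skill assessment
●  Assessing the situation in team play
●  Visualization for broadcast video
Workshops and physical education classes
     ■    Video recording and screening
     ■    Direct instruction by a coach

Face-to-face instruction of many students by a teacher or coach
Instruction of a small number of students
Students' self-study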
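Sports coaching and skill acquisition
Three processes in skill acquisition (Kudo, 2000)
1.  Emergence of whole-body coordinated movement
   ■    Do not drill the movement; give a clear task and wait for the movement to emerge
   ■    Give the right expert coaching cue at the right time
2.  Refinement through practice
   ■    Do not give too much new information (avoid over-dependence)
   ■    Emphasize intrinsic feedback and use augmented feedback only as a supplement
3.  Automation

What the system aims for
   ■    Present a goal that becomes the learner's task
   ■    Present material that supports appropriate coaching
   ■    Do not feed back information whose analysis process is not visible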
Table Tennis
■    Factors of table tennis
     ■    Ball speed and spin
■    Ball spin (rotation)
     ■    One of the major factors that affect performance
          ●    Average 8000 [rpm] for the Chinese national team (Wu, 1992)
     ■    Great effect on a rally
          ●    Large change of angle after impact with a racket
■    Conventional measurement
     ■    Manual counting
     ■    The official ball diameter was changed
          ●    In 2000, from 38 mm to 40 mm
          ●    Effect investigated (Tang 2002, Kawazoe 2004)

●    Method: motion estimation from a video
●    Goal: feedback spin to a player

Toru Tamaki, Yukihiko Ushiyama, Takeshi Yasaka (八坂剛史): "Video processing for improving athletes' skills and its practical application" (in Japanese), IEICE Technical Report, Pattern Recognition and Media Understanding, PRMU2005-116, No. 116, pp. 13-18, Hiroshima City University, Hiroshima (Nov. 2005).
Wu Huan Qun, Qin Zhifeng, Xu Shaofa, Xi Enting: "Experimental Research in Table Tennis Spin," International Journal of Table Tennis Science, The International Table Tennis Federation, No. 1, pp. 73-79 (Aug. 1992).
Hai-peng Tang, Masato Mizoguchi, Shintaro Toyoshima: "Speed and spin characteristics of the 40mm table tennis ball," Table Tennis Sciences, No. 4&5, ITTF Sports Science Committee, pp. 278-284 (2002).
Y. Kawazoe, D. Suzuki: "Comparison of the 40 and 38 mm table tennis balls in terms of impact with a racket based on predicted impact phenomena," in Science and Racket Sports III, pp. 140-145 (2004).
Related work
■    B&W corner tracking (2D rotation)
     Alexander Szep, "Quantifying Rotations of Spheric Objects," The Twelfth IAPR Conference on Machine Vision Applications (MVA2010), 2010.
■    Color corner tracking, stereo
     Christian Theobalt, Irene Albrecht, Jörg Haber, Marcus Magnor, and Hans-Peter Seidel, "Pitching a baseball: tracking high-speed motion with multi-exposure images," in ACM SIGGRAPH 2004 Papers (SIGGRAPH '04), Joe Marks (Ed.), ACM, New York, NY, USA, pp. 540-547, 2004. DOI: 10.1145/1186562.1015758
■    One shot estimation from motion blur
     Giacomo Boracchi, Vincenzo Caglioti, and Alessandro Giusti, "Estimation of 3d instantaneous motion of a ball from a single motion-blurred image," in VISIGRAPP, Computer Vision and Computer Graphics. Theory and Applications, Communications in Computer and Information Science, 2009, Volume 24, Part 3, Part 6, pp. 225-237. DOI: 10.1007/978-3-642-10226-4_18
Related work
■    Parametric Eigenspace method with CG
     Takuya Inoue, Yuko Uematsu, Hideo Saito: "Rotation speed estimation system for a baseball using high-speed camera video" (in Japanese), IEEJ Transactions on Industry Applications (電学論D), Vol. 131, No. 4, pp. 608-615 (2011).
■    Motion parameter estimation by minimizing intensity error
     H. Shum, T. Komura, "Tracking the translational and rotational movement of the ball using high-speed camera movies," IEEE International Conference on Image Processing (ICIP 2005), vol. 3, pp. III-1084-7, 11-14 Sept. 2005.
■    Shape and motion from optical flow
     M. Yamamoto, "A General Aperture Problem for Direct Estimation of 3-D Motion Parameters," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 11, no. 5, pp. 528-536, May 1989. DOI: 10.1109/34.24785
     Ken-ichi Kanatani, "Structure and motion from optical flow under perspective projection," Computer Vision, Graphics, and Image Processing, Volume 38, Issue 2, May 1987, pp. 122-146, ISSN 0734-189X. DOI: 10.1016/S0734-189X(87)80133-0
     Ken-ichi Kanatani, "Structure and motion from optical flow under orthographic projection," Computer Vision, Graphics, and Image Processing, Volume 35, Issue 2, August 1986, pp. 181-199, ISSN 0734-189X. DOI: 10.1016/0734-189X(86)90026-5
An experimental system
■    Motion parameter estimation of a textured rigid object
     ■    Rotation: angles about each axis
     ■    Translation
[Figure: zoom capture with a high speed camera at 500 [fps], followed by image processing; a ball with X, Y, and Z axes and the rotations ωx, ωy, ωz about them]
Spin estimation
Toru Tamaki, Takahiko Sugino, Masanobu Yamamoto: "Measuring Ball Spin by Image Registration," Proc. of FCV2004: the 10th Korea-Japan Joint Workshop on Frontiers of Computer Vision, pp. 269-274, Kyushu University, Fukuoka, Japan, Feb. 3-4, 2004.

[Figure: estimated spin ωx, ωy, ωz, and ω in rpm (vertical axis, -3000 to 6000) against frame number, for an expert (frames 31 to 37) and a beginner (frames 15 to 21)]
Motion model of a ball in 2 frames




  Optimized by the Gauss-Newton method with a 1D line search, coarse-to-fine
Depth value from a sphere model




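The slide's equation did not survive extraction, so the following is only a minimal sketch of how a depth value could be read off a sphere model. It assumes orthographic projection, a ball outline with center `(cx, cy)` and radius `radius` in pixels, and a camera looking along +Z so that the visible hemisphere has the smaller Z. The function names and the sign convention are illustrative assumptions, not taken from the slides.

```python
import math


def sphere_depth(x, y, cx, cy, radius):
    """Depth of the visible sphere surface at pixel (x, y), relative to the ball center.

    Assumption: orthographic projection with the camera looking along +Z,
    so the visible surface lies at negative Z offsets from the center.
    Returns None when (x, y) falls outside the ball outline.
    """
    dx, dy = x - cx, y - cy
    d2 = radius * radius - dx * dx - dy * dy
    if d2 < 0.0:
        return None
    return -math.sqrt(d2)


def lift_to_3d(x, y, cx, cy, radius):
    """Lift a pixel to the 3D vector X - C on the sphere (None if off the ball)."""
    z = sphere_depth(x, y, cx, cy, radius)
    if z is None:
        return None
    return (x - cx, y - cy, z)


# A pixel 8 px to the right of the center of a ball with a 17 px radius.
print(lift_to_3d(108.0, 50.0, 100.0, 50.0, 17.0))
```

Because the projection is orthographic, the x and y components of the lifted point are simply the pixel offsets from the ball center; only the depth needs the sphere model.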
Improvements

■    Use orthographic projection
     ■    Perspective projection is not necessary
     ■    Ball is 40 mm in diameter
     ■    Distance between the ball and the camera is 3 to 5 m
■    Ignore Z translation
     ■    Z value variation is at most 152.5 cm (table width)
■    Use ball centers and radius
     ■    Provided manually or by circle detection methods
■    Inverse Compositional Alignment (ICA)
     ■    Precomputing the Hessian and steepest descent images
           (Baker and Matthews, 2004)
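Orthographic projection
■    A ball is too small
     ■    Distance: 3 to 5 m
     ■    Appearance difference is less than a pixel.
     ■    Hence, ignore perspective!
[Figure: a ball of diameter 4 cm seen from a camera 3 m away, drawn schematically and at real scale. The ball subtends 0.76 [deg] at the camera, the visible arc of the ball is 179.23 [deg], and the visible width is 3.999911110 cm.]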
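The numbers on this slide can be checked with a little geometry. The sketch below assumes the distance is measured from the camera to the ball center and that the "visible width" is the chord between the two tangent points of the viewing rays. The function name `apparent_ball` and the pixel estimate are illustrative assumptions; the 17-pixel radius is the value reported later for the real sequences.

```python
import math


def apparent_ball(diameter_cm, distance_cm):
    """Return (visible width in cm, angle at camera in deg, visible arc in deg)."""
    r = diameter_cm / 2.0
    half = math.asin(r / distance_cm)   # half-angle of the tangent cone at the camera
    width = 2.0 * r * math.cos(half)    # chord between the two tangent points
    angle = math.degrees(2.0 * half)    # angle subtended by the ball at the camera
    return width, angle, 180.0 - angle


RADIUS_PX = 17.0  # image radius of the ball, as reported for the real sequences

for metres in (2, 3, 4, 5):
    width, angle, arc = apparent_ball(4.0, metres * 100.0)
    # How much narrower the perspective outline is than the full diameter, in pixels.
    shrink_px = 2.0 * RADIUS_PX * (1.0 - width / 4.0)
    print(f"{metres} m: width {width:.9f} cm, angle {angle:.2f} deg, "
          f"arc {arc:.2f} deg, outline shrinks by {shrink_px:.5f} px")
```

Under these assumptions the 3 m case reproduces the visible width and the camera angle shown on the slide, and the arc agrees up to rounding in the last digit. The difference from the full 4 cm diameter stays far below one pixel at every distance in the loop, which is the argument for dropping perspective.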
Z variation
■    Table width is small relative to the camera distance
     ■    Z variation is at most ± 1 m
     ■    Hence, give up for Z!
■    Official table size: 2.74 m x 1.525 m; camera at about 3 m
[Figure: a ball of diameter 4 cm seen from a camera at 3 m ± 1 m, next to the table drawn at real scale]

     Camera distance    Visible width      Visible arc     Angle at camera
     2 m                3.999799995 cm     178.85 [deg]    1.15 [deg]
     3 m                3.999911110 cm     179.23 [deg]    0.76 [deg]
     4 m                3.999950000 cm     179.43 [deg]    0.57 [deg]
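Computing area
■    Smaller area (x0.9 radius) is used for computing intensity differences
■    Maximum rpm (revolutions per minute): 10,000 rpm
     = 0.2777… revolutions per 1/600 sec
     = 100 degrees / frame (1/600 sec)
[Figure: a ball of diameter 4 cm with an inner computing area of diameter 3.6 cm (x0.9). The margin between the area boundary and the ball outline corresponds to 25.84 [deg] of rotation per frame (= 2584 rpm).]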
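Computing area
■    Smaller area (x0.7 radius) is used for computing intensity differences
■    Maximum rpm (revolutions per minute): 10,000 rpm
     = 0.2777… revolutions per 1/600 sec
     = 100 degrees / frame (1/600 sec)
[Figure: a ball of diameter 4 cm with an inner computing area of diameter 2.8 cm (x0.7). The margin between the area boundary and the ball outline corresponds to 45.57 [deg] of rotation per frame (= 4557 rpm).]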
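The arithmetic on the two "Computing area" slides can be reproduced as follows. This is a sketch under the assumption that the quoted angle is the rotation that carries a surface point from the boundary of the shrunken area to the ball outline, that is, arccos of the scale factor, and that the frame rate is 600 fps as on the slides. The function names are illustrative.

```python
import math

FPS = 600.0  # frame rate used on the slides


def degrees_per_frame(rpm, fps=FPS):
    """Rotation in degrees between two consecutive frames for a given spin."""
    return rpm * 360.0 / 60.0 / fps


def rpm_from_degrees_per_frame(deg, fps=FPS):
    """Spin in rpm that corresponds to a per-frame rotation in degrees."""
    return deg / 360.0 * fps * 60.0


def margin_angle_deg(scale):
    """Rotation that moves a point at `scale` x radius out to the ball outline."""
    return math.degrees(math.acos(scale))


print(f"10,000 rpm = {degrees_per_frame(10000.0):.1f} deg/frame")
for scale in (0.9, 0.7):
    margin = margin_angle_deg(scale)
    print(f"x{scale}: margin {margin:.2f} deg = "
          f"{rpm_from_degrees_per_frame(margin):.0f} rpm")
```

A smaller computing area leaves a wider margin, so texture inside the area stays visible in the next frame at higher spin rates, at the cost of fewer pixels contributing to the intensity differences.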
Ball centers and radius
■    Provided by other methods
     ■    Circle detection by Hough transform
     ■    Manual operations by mouse clicking
     ■    Tracking frame by frame
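As one illustration of the manual option, a center and radius can be recovered from three points clicked on the ball outline. This is a hypothetical sketch using the circumscribed circle of three points; it is not necessarily the procedure used by the authors, and the function name `circle_from_three_points` is an assumption.

```python
import math


def circle_from_three_points(p1, p2, p3):
    """Center and radius of the circle through three non-collinear 2D points."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    d = 2.0 * (x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2))
    if abs(d) < 1e-12:
        raise ValueError("points are collinear; no unique circle")
    s1 = x1 * x1 + y1 * y1
    s2 = x2 * x2 + y2 * y2
    s3 = x3 * x3 + y3 * y3
    cx = (s1 * (y2 - y3) + s2 * (y3 - y1) + s3 * (y1 - y2)) / d
    cy = (s1 * (x3 - x2) + s2 * (x1 - x3) + s3 * (x2 - x1)) / d
    return cx, cy, math.hypot(x1 - cx, y1 - cy)


# Three hypothetical mouse clicks on the outline of a ball.
clicks = [(117.0, 50.0), (100.0, 67.0), (83.0, 50.0)]
cx, cy, radius = circle_from_three_points(*clicks)
print(cx, cy, radius)
```

With more than three clicks, or with edge points from an image, a least-squares circle fit or a Hough transform would be more robust to imprecise clicks than this exact three-point construction.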
ICA
■    Previous approach (Tamaki et al., 2004)
     ■    Forward additive approach (Baker and Matthews, 2004)
     ■    Minimizing a cost function f
     ■    Find parameter updates ∆p
     ■    Update p+∆p→p
[Figure: previous approach, parameters updated as p+∆p]
ICA
■    ICA approach
     ■    Inverse Compositional approach (Baker and Matthews, 2004)
     ■    Minimizing a cost function f
     ■    Find parameter updates ∆p
     ■    Update by composition: w(x; p) ∘ w(x; ∆p)⁻¹ → w(x; p)
[Figure: previous approach (p+∆p) compared with the ICA approach]
Advantages of ICA
■    Pre-computation
     ■    Hessian, steepest descent images
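To make the pre-computation concrete, here is a toy inverse compositional loop for the simplest possible warp, a 1D shift w(x; p) = x + p. It is only a sketch of the general scheme described by Baker and Matthews, not the sphere warp of these slides; the Gaussian test signal, the linear interpolation, and all function names are assumptions made for the example.

```python
import math


def gaussian_bump(x, centre=50.0, sigma=8.0):
    """Smooth 1D test signal standing in for image texture."""
    return math.exp(-((x - centre) ** 2) / (2.0 * sigma ** 2))


def sample(signal, x):
    """Linear interpolation into a list, clamped at both ends."""
    x = min(max(x, 0.0), len(signal) - 1.0)
    i = min(int(x), len(signal) - 2)
    t = x - i
    return (1.0 - t) * signal[i] + t * signal[i + 1]


def inverse_compositional_shift(template, image, xs, p=0.0, iters=20, tol=1e-6):
    """Estimate p such that image(x + p) matches template(x) on the points xs."""
    # Pre-computation, done once: for w(x; p) = x + p the steepest descent
    # "image" is the template gradient, and the Hessian is a single number.
    sd = [(sample(template, x + 1) - sample(template, x - 1)) / 2.0 for x in xs]
    hessian = sum(g * g for g in sd)
    for _ in range(iters):
        # Only the warped input and the error are recomputed per iteration.
        err = [sample(image, x + p) - sample(template, x) for x in xs]
        dp = sum(g * e for g, e in zip(sd, err)) / hessian
        p -= dp  # compose with the inverse of the incremental shift
        if abs(dp) < tol:
            break
    return p


true_shift = 3.2
template = [gaussian_bump(x) for x in range(101)]
image = [gaussian_bump(x - true_shift) for x in range(101)]
estimate = inverse_compositional_shift(template, image, list(range(20, 81)))
print(f"true shift {true_shift}, estimated {estimate:.3f}")
```

In a forward additive scheme the gradient and Hessian would be re-evaluated on the warped input at every iteration. Here they depend only on the template, so the loop body reduces to one warp, one subtraction, and one weighted sum.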
Warp update rule
■    Incremental warp: X1 = ΔR (X - C) + ΔT + C
■    Current warp:     X2 = R (X - C) + T + C
■    Sphere model: from 2D to 3D around the ball center C
■    The updated motion parameters
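The equation for the updated parameters did not survive extraction. One consistent reading, assuming the update composes the current warp with the inverse of the incremental warp as in the inverse compositional formulation, is R ← R ΔRᵀ and T ← T - R ΔRᵀ ΔT. The sketch below checks that reading numerically; the helper names and the sample values are illustrative assumptions.

```python
import math


def rot_x(deg):
    """Rotation matrix about the X axis."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def rot_z(deg):
    """Rotation matrix about the Z axis."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]


def warp(R, T, C, X):
    """Rigid motion about the ball center C: R (X - C) + T + C."""
    return [sum(R[i][j] * (X[j] - C[j]) for j in range(3)) + T[i] + C[i]
            for i in range(3)]


def compose_with_inverse(R, T, dR, dT):
    """Parameters of warp(R, T) composed with the inverse of warp(dR, dT)."""
    R_new = matmul(R, transpose(dR))
    T_new = [T[i] - sum(R_new[i][j] * dT[j] for j in range(3)) for i in range(3)]
    return R_new, T_new


C = [100.0, 50.0, 0.0]                  # ball center
R, T = rot_z(5.0), [1.0, -2.0, 0.0]     # current motion parameters
dR, dT = rot_x(0.5), [0.1, 0.2, 0.0]    # incremental motion parameters
R_new, T_new = compose_with_inverse(R, T, dR, dT)

X = [108.0, 50.0, -15.0]                # a point on the sphere surface
X1 = warp(dR, dT, C, X)
# Applying the updated warp to X1 should give the same point as the current warp on X.
print(warp(R_new, T_new, C, X1))
print(warp(R, T, C, X))
```

The two printed points agree, which is the defining property of the composed warp: the updated parameters undo the incremental motion and then apply the current one.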
Experiments with real sequences
[Figure: frames from a real sequence, 432x192 pixels, 600 fps (ball radius: 16 to 17 pixels)]
Experiments




Simulation
■    Synthetic image sequence
     ■    with a 3D CG software
     ■    Orthographic projection
■    Background: gray
■    Texture: text
■    Radius: 35 pixels
■    Rotation
     ■    5 degrees about each of the X, Y, and Z axes
■    Translation: no
[Figure: estimated angle [degree] (vertical axis, -6 to 6) for the X rotation, Y rotation, and Z rotation sequences]
Visual inspection
■    Radius: 61 pixels
■    Cost function: based on I1(x) - I2(w(x; p))
[Figure: images I1 and I2, the warped image, and the difference between images (the brighter the closer)]
Conclusion

■    Proposed an ICA-based ball spin estimation method
     ■    Orthographic projection
     ■    Centers and radius given
     ■    ICA approach

■    Future work
     ■    Easy-to-use system design
     ■    Accuracy/error analysis
     ■    Fast implementation
     ■    Ball detection/tracking

PDF
The Rise and Fall of 3GPP – Time for a Sabbatical?
PDF
Peak of Data & AI Encore- AI for Metadata and Smarter Workflows
PDF
cuic standard and advanced reporting.pdf
PDF
Encapsulation theory and applications.pdf
PDF
A comparative analysis of optical character recognition models for extracting...
PDF
Encapsulation_ Review paper, used for researhc scholars
PDF
MIND Revenue Release Quarter 2 2025 Press Release
PDF
Build a system with the filesystem maintained by OSTree @ COSCUP 2025
PDF
Assigned Numbers - 2025 - Bluetooth® Document
PDF
Spectral efficient network and resource selection model in 5G networks
PDF
Approach and Philosophy of On baking technology
PPTX
A Presentation on Artificial Intelligence
Reach Out and Touch Someone: Haptics and Empathic Computing
Profit Center Accounting in SAP S/4HANA, S4F28 Col11
Optimiser vos workloads AI/ML sur Amazon EC2 et AWS Graviton
NewMind AI Weekly Chronicles - August'25-Week II
Diabetes mellitus diagnosis method based random forest with bat algorithm
Network Security Unit 5.pdf for BCA BBA.
ACSFv1EN-58255 AWS Academy Cloud Security Foundations.pptx
gpt5_lecture_notes_comprehensive_20250812015547.pdf
The Rise and Fall of 3GPP – Time for a Sabbatical?
Peak of Data & AI Encore- AI for Metadata and Smarter Workflows
cuic standard and advanced reporting.pdf
Encapsulation theory and applications.pdf
A comparative analysis of optical character recognition models for extracting...
Encapsulation_ Review paper, used for researhc scholars
MIND Revenue Release Quarter 2 2025 Press Release
Build a system with the filesystem maintained by OSTree @ COSCUP 2025
Assigned Numbers - 2025 - Bluetooth® Document
Spectral efficient network and resource selection model in 5G networks
Approach and Philosophy of On baking technology
A Presentation on Artificial Intelligence

201109CVIM/PRMU Inverse Composite Alignment of a sphere under orthogonal projection for ball spin estimation

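  • 1. Inverse Compositional Alignment of a sphere under orthogonal projection for ball spin estimation
      Toru Tamaki (Hiroshima University), Yukihiko Ushiyama (Faculty of Education and Human Sciences, Niigata University), Bisser Raytchev (Hiroshima University), Kazufumi Kaneda (Hiroshima University)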
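  • 2. Research background: video image processing technology
      ● Biomechanics
      ● Skill assessment
      ● Assessment of team-play situations
      ● Visualization technology for broadcast video
      ● Workshops and physical education classes
          - Video recording and playback
          - Direct instruction by a coach
      Target settings: instruction for a small number of students; face-to-face instruction of many students by teachers or coaches; students' self-study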
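  • 3. Sports coaching and skill acquisition
      Three processes in skill acquisition (Kudo, 2000):
      1. Emergence of whole-body coordinated movement
          - Do not drill the movement in; give a clear task and wait for the movement to emerge
          - Give appropriate expert coaching words at the appropriate time
      2. Refinement through practice
          - Do not give too much new information (avoid excessive dependence)
          - Emphasize intrinsic feedback and use augmented feedback only as a supplement
      3. Automation
      What the system aims for:
          - Present goals that serve as tasks for the learner
          - Present materials that support appropriate coaching
          - Do not feed back information whose analysis process is not visible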
  • 4. Table Tennis
      ■ Factors of table tennis
          - Ball speed and spin
      ■ Ball spin (rotation)
          - One of the major factors affecting performance: average 8000 rpm for the Chinese national team (Wu, 1992)
          - Great effect on a rally: large change of angle after impact with a racket
      ■ Conventional measurement
          - Manual counting
          - The official ball was changed in 2000 from 38 mm to 40 mm in diameter; the effect has been investigated (Tang 2002, Kawazoe 2004)
      ● Method: motion estimation from a video
      ● Goal: feed the spin back to the player
      References:
          - Toru Tamaki, Yukihiko Ushiyama, 八坂剛史: "Video image processing for improving athletes' skills and its practical application" (in Japanese), IEICE Technical Report, Pattern Recognition and Media Understanding (PRMU), PRMU2005-116, No. 116, pp. 13-18, Hiroshima City University, Hiroshima (Nov. 2005).
          - Wu Huan Qun, Qin Zhifeng, Xu Shaofa, Xi Enting: "Experimental Research in Table Tennis Spin," International Journal of Table Tennis Science, The International Table Tennis Federation, No. 1, pp. 73-79 (Aug. 1992).
          - Hai-peng Tang, Masato Mizoguchi, Shintaro Toyoshima: "Speed and spin characteristics of the 40mm table tennis ball," Table Tennis Sciences, No. 4&5, ITTF Sports Science Committee, pp. 278-284 (2002).
          - Y. Kawazoe, D. Suzuki: "Comparison of the 40 and 38 mm table tennis balls in terms of impact with a racket based on predicted impact phenomena," in Science and Racket Sports III, pp. 140-145 (2004).
  • 5. Related work
      Approaches: color corner tracking with stereo; B&W corner tracking (2D rotation); one-shot estimation from motion blur
      References:
          - Alexander Szep: "Quantifying Rotations of Spheric Objects," The Twelfth IAPR Conference on Machine Vision Applications (MVA2010), 2010.
          - Christian Theobalt, Irene Albrecht, Jörg Haber, Marcus Magnor, Hans-Peter Seidel: "Pitching a baseball: tracking high-speed motion with multi-exposure images," in ACM SIGGRAPH 2004 Papers (SIGGRAPH '04), Joe Marks (Ed.), ACM, New York, NY, USA, pp. 540-547, 2004. DOI: 10.1145/1186562.1015758
          - Giacomo Boracchi, Vincenzo Caglioti, Alessandro Giusti: "Estimation of 3d instantaneous motion of a ball from a single motion-blurred image," in VISIGRAPP, Computer Vision and Computer Graphics. Theory and Applications, Communications in Computer and Information Science, Volume 24, Part 3, Part 6, pp. 225-237, 2009. DOI: 10.1007/978-3-642-10226-4_18
  • 6. Related work
      Approaches: parametric eigenspace method with CG; motion parameter estimation by minimizing intensity error; shape and motion from optical flow
      References:
          - H. Shum, T. Komura: "Tracking the translational and rotational movement of the ball using high-speed camera movies," Proc. IEEE International Conference on Image Processing (ICIP 2005), vol. 3, pp. III-1084-7, 11-14 Sept. 2005.
          - Takuya Inoue, Yuko Uematsu, Hideo Saito: "A rotation speed estimation system for a baseball using high-speed camera video" (in Japanese), IEEJ Transactions on Industry Applications (電学論D), Vol. 131, No. 4, pp. 608-615 (2011).
          - M. Yamamoto: "A General Aperture Problem for Direct Estimation of 3-D Motion Parameters," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 11, no. 5, pp. 528-536, May 1989. DOI: 10.1109/34.24785
          - Ken-ichi Kanatani: "Structure and motion from optical flow under perspective projection," Computer Vision, Graphics, and Image Processing, Volume 38, Issue 2, May 1987, pp. 122-146, ISSN 0734-189X. DOI: 10.1016/S0734-189X(87)80133-0
          - Ken-ichi Kanatani: "Structure and motion from optical flow under orthographic projection," Computer Vision, Graphics, and Image Processing, Volume 35, Issue 2, August 1986, pp. 181-199, ISSN 0734-189X. DOI: 10.1016/0734-189X(86)90026-5
  • 7. An experimental system
      Pipeline: zoom capture with a high-speed camera (500 fps), followed by image processing
      ■ Motion parameter estimation of a textured rigid object
          - Rotation: angles about each axis
          - Translation
      [Figure: ball with X, Y, and Z axes and the rotations ωx, ωy, ωz about them.]
  • 8. Spin estimation
      Toru Tamaki, Takahiko Sugino, Masanobu Yamamoto: "Measuring Ball Spin by Image Registration," Proc. of FCV2004: the 10th Korea-Japan Joint Workshop on Frontiers of Computer Vision, pp. 269-274, Kyushu University, Fukuoka, Japan, Feb. 3-4, 2004.
      [Figure: two plots, "Expert" and "Beginner", of estimated spin (ωx, ωy, ωz, and ω) in rpm against frame number (frames 31 to 37 and 15 to 21); vertical axis from -3000 to 6000 rpm.]
  • 9. Motion model of a ball in 2 frames: optimized by the Gauss-Newton method with 1D line search, coarse-to-fine
  • 10. Depth value from a sphere model
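The depth formula itself is not preserved in this transcript. As a minimal sketch, assuming orthographic projection and a ball center and radius known in pixels (both adopted later in the talk), a pixel inside the silhouette can be lifted onto the front hemisphere as below. The function name `lift_to_sphere` and the sign convention (z measured from the ball center toward the camera) are illustrative assumptions, not the authors' notation.

```python
import numpy as np

def lift_to_sphere(x, y, cx, cy, radius):
    """Return depth z on the front hemisphere for image points (x, y).

    Assumes orthographic projection: a point at offset (dx, dy) from the
    ball center lies at z = sqrt(r^2 - dx^2 - dy^2) in front of the center.
    Also returns a mask telling which points fall inside the silhouette.
    """
    dx = np.asarray(x, dtype=float) - cx
    dy = np.asarray(y, dtype=float) - cy
    squared = radius ** 2 - dx ** 2 - dy ** 2
    inside = squared >= 0.0
    # Points outside the silhouette have no depth; report 0 there.
    z = np.sqrt(np.where(inside, squared, 0.0))
    return z, inside

# Hypothetical ball: center (10, 10), radius 16 pixels.
z, inside = lift_to_sphere([10.0, 26.0, 40.0], [10.0, 10.0, 10.0], 10.0, 10.0, 16.0)
print(z, inside)
```

The center pixel gets the full radius as depth, a pixel on the silhouette edge gets zero, and pixels outside the silhouette are masked out.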
  • 11. Improvements
      ■ Use orthographic projection
          - Perspective projection is not necessary
          - The ball is 40 mm in diameter
          - The distance between the ball and the camera is 3 to 5 m
      ■ Ignore Z translation
          - Z variation is at most 152.5 cm (table width)
      ■ Use ball centers and radius
          - Provided manually or by circle detection methods
      ■ Inverse Compositional Alignment (ICA)
          - Precompute the Hessian and steepest descent images (Baker and Matthews, 2004)
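  • 13. Orthographic projection
      ■ The ball is small relative to the camera distance
          - Distance: 3 to 5 m
          - The difference in appearance is less than a pixel
          - Hence, ignore perspective
      [Figure, real scale: a 4 cm ball seen by a camera 3 m away. The visible silhouette is 3.999911110 cm wide, the visible arc is 179.23 deg, and the ball subtends 0.76 deg at the camera.]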
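The figures on this slide can be checked with elementary geometry. The sketch below assumes a pinhole camera at distance d from the center of a ball of radius r: the tangent rays touch the ball where sin(a) = r/d, so the silhouette is 2r·cos(a) wide and the ball subtends 2a. The function name `silhouette` is an illustrative choice.

```python
import math

def silhouette(diameter_cm: float, distance_m: float):
    """Silhouette width (cm), subtended angle (deg), and visible arc (deg)
    of a ball viewed by a pinhole camera at the given distance."""
    r = diameter_cm / 2.0
    d = distance_m * 100.0            # metres to centimetres
    half_angle = math.asin(r / d)     # angle between the optical axis and a tangent ray
    width_cm = 2.0 * r * math.cos(half_angle)
    subtended_deg = math.degrees(2.0 * half_angle)
    visible_arc_deg = 180.0 - subtended_deg
    return width_cm, subtended_deg, visible_arc_deg

for distance in (2.0, 3.0, 4.0):
    width, subtended, arc = silhouette(4.0, distance)
    print(f"{distance} m: width {width:.9f} cm, subtends {subtended:.2f} deg, arc {arc:.2f} deg")
```

Under these assumptions the silhouette widths match the slide values (about 3.99991 cm at 3 m), and the angles agree with the slides to within about 0.01 deg, which supports treating the projection as orthographic.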
  • 15. Z variation
      ■ The table is small relative to the camera distance
          - Z variation is at most ± 1 m
          - Hence, ignore Z translation
      [Figure: official table size 2.74 m x 1.525 m, camera about 3 m away. Silhouette of the 4 cm ball at 3 m: 3.999911110 cm wide; visible arc 179.23 deg; subtended angle 0.76 deg.]
  • 16. Z variation (same slide, ball about 2 m from the camera, within the ± 1 m range): silhouette 3.999799995 cm wide; visible arc 178.85 deg; subtended angle 1.15 deg
  • 17. Z variation (same slide, ball about 3 m from the camera): silhouette 3.999911110 cm wide; visible arc 179.23 deg; subtended angle 0.76 deg
  • 18. Z variation (same slide, ball about 4 m from the camera): silhouette 3.999950000 cm wide; visible arc 179.43 deg; subtended angle 0.57 deg
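  • 19. Computing area
      A smaller area (0.9 x radius) is used for computing intensity differences.
          - Maximum spin: 10,000 rpm (revolutions per minute) = 0.2777… revolutions per frame (1/600 sec) = 100 degrees per frame
          - Ball diameter 4 cm; computing area 3.6 cm (x 0.9)
          - Angle between the edge of the computing area and the silhouette edge: 25.84 deg, which corresponds to 25.84 degrees per frame = 2584 rpm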
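  • 20. Computing area
      A smaller area (0.7 x radius) is used for computing intensity differences.
          - Maximum spin: 10,000 rpm (revolutions per minute) = 0.2777… revolutions per frame (1/600 sec) = 100 degrees per frame
          - Ball diameter 4 cm; computing area 2.8 cm (x 0.7)
          - Angle between the edge of the computing area and the silhouette edge: 45.57 deg, which corresponds to 45.57 degrees per frame = 4557 rpm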
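The numbers on the two "Computing area" slides are consistent with a simple reading, offered here as an assumption rather than the authors' stated derivation: under orthographic projection a surface point at image radius s·r sits at angle asin(s) from the viewing axis, so it can rotate by acos(s) before reaching the silhouette edge. The helper names below are illustrative.

```python
import math

def degrees_per_frame(rpm: float, fps: float) -> float:
    """Rotation per frame, in degrees, for a given spin rate."""
    return rpm * 360.0 / 60.0 / fps

def max_spin_within_area(scale: float, fps: float):
    """Angular margin (deg) between a disk of `scale` x radius and the
    silhouette edge, and the spin rate (rpm) that uses up that margin
    in exactly one frame."""
    margin_deg = math.degrees(math.acos(scale))
    rpm = margin_deg / 360.0 * fps * 60.0
    return margin_deg, rpm

print(degrees_per_frame(10000.0, 600.0))   # rotation per frame at the maximum spin
for scale in (0.9, 0.7):
    margin, rpm = max_spin_within_area(scale, 600.0)
    print(f"x{scale}: margin {margin:.2f} deg, about {rpm:.0f} rpm")
```

At 600 fps this gives roughly 25.84 deg (2584 rpm) for the 0.9 scale and 45.57 deg (4557 rpm) for the 0.7 scale, matching the slides: shrinking the computing area trades pixels for tolerance to faster spin.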
  • 22. Ball centers and radius
      ■ Provided by other methods
          - Circle detection by Hough transform
          - Manual operation by mouse clicking
          - Tracking frame by frame
  • 24. ICA
      ■ Previous approach (Tamaki et al., 2004)
          - Forward additive approach (Baker and Matthews, 2004)
          - Minimize a cost function f
          - Find parameter updates ∆p
          - Update: p + ∆p → p
  • 25. ICA
      ■ Previous approach (Tamaki et al., 2004)
          - Forward additive approach (Baker and Matthews, 2004)
          - Minimize a cost function f
          - Find parameter updates ∆p
          - Update: p + ∆p → p
      ■ ICA approach
          - Inverse compositional approach (Baker and Matthews, 2004)
          - Minimize a cost function f
          - Find parameter updates ∆p
          - Update by composing the current warp with the inverse of the incremental warp
  • 26. Advantages of ICA
      ■ Precomputation
          - Hessian, steepest descent images
      ■ ICA approach
          - Inverse compositional approach (Baker and Matthews, 2004)
          - Minimize a cost function f
          - Find parameter updates ∆p
          - Update by composing the current warp with the inverse of the incremental warp
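To illustrate why precomputation helps, here is a minimal sketch of the inverse compositional scheme of Baker and Matthews (2004) for the simplest possible warp, a 2D translation. It is not the sphere rotation model of this talk: the analytic test image, the function names, and the true shift are all assumptions made for illustration. The template gradient, steepest descent images, and Hessian are computed once, outside the loop.

```python
import numpy as np

def smooth_image(x, y):
    """Hypothetical smooth test pattern, evaluable at real-valued coordinates."""
    return np.sin(0.3 * x) * np.cos(0.2 * y) + 0.5 * np.sin(0.1 * x + 0.15 * y)

def inverse_compositional_translation(template, sample_input, xs, ys, iters=30):
    """Estimate p such that sample_input(x + p) matches template(x)."""
    # Precomputation (done once): template gradient, steepest descent
    # images, and the Gauss-Newton Hessian. For a pure translation the
    # warp Jacobian is the identity, so the steepest descent images are
    # simply the template gradients.
    grad_y, grad_x = np.gradient(template)          # axis 0 is y, axis 1 is x
    sd = np.stack([grad_x.ravel(), grad_y.ravel()], axis=1)
    hessian_inv = np.linalg.inv(sd.T @ sd)

    p = np.zeros(2)
    for _ in range(iters):
        warped = sample_input(xs + p[0], ys + p[1])  # I(W(x; p))
        error = (warped - template).ravel()
        dp = hessian_inv @ (sd.T @ error)
        # Compose with the inverse incremental warp; for a translation
        # this reduces to subtracting dp.
        p = p - dp
        if np.linalg.norm(dp) < 1e-10:
            break
    return p

xs, ys = np.meshgrid(np.arange(64.0), np.arange(48.0))
template = smooth_image(xs, ys)
true_shift = np.array([1.5, -0.8])                   # assumed ground truth
shifted = lambda x, y: smooth_image(x - true_shift[0], y - true_shift[1])
print(inverse_compositional_translation(template, shifted, xs, ys))
```

Only the image warp and the error image are recomputed per iteration; in a forward additive scheme the gradient and Hessian would be recomputed every time.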
  • 27. Warp update rule
      X1 = ∆R(X − C) + ∆T + C
      Sphere model: from 2D to 3D. C: ball center.
  • 28. Warp update rule
      X1 = ∆R(X − C) + ∆T + C
      X2 = R(X − C) + T + C
      Sphere model: from 2D to 3D. C: ball center.
      The updated motion parameters
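The formula for the updated parameters is not preserved in this transcript. One consistent way to obtain it, given here as a sketch and not necessarily the form shown on the slide, is to compose the current warp with the inverse of the incremental warp. Inverting X1 = ∆R(X − C) + ∆T + C and substituting into X2 = R(X − C) + T + C gives R ← R∆Rᵀ and T ← T − R∆Rᵀ∆T. The code below checks this numerically; all function names and test values are illustrative.

```python
import numpy as np

def rotation_from_axis_angle(omega):
    """Rotation matrix from an axis-angle vector (Rodrigues' formula)."""
    theta = np.linalg.norm(omega)
    if theta < 1e-12:
        return np.eye(3)
    k = omega / theta
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def warp(X, R, T, C):
    """Rigid motion about the ball center C, applied to the rows of X."""
    return (X - C) @ R.T + T + C

def inverse_increment(X, dR, dT, C):
    """Inverse of the incremental warp X1 = dR (X - C) + dT + C."""
    return (X - C - dT) @ dR + C

def compose_with_inverse_increment(R, T, dR, dT):
    """Parameters of warp(., R, T) composed with the inverse incremental warp."""
    R_new = R @ dR.T
    T_new = T - R_new @ dT
    return R_new, T_new

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
C = np.array([1.0, 2.0, 0.0])
R, T = rotation_from_axis_angle(np.array([0.1, -0.2, 0.3])), np.array([0.5, -0.4, 0.0])
dR, dT = rotation_from_axis_angle(np.array([0.02, 0.01, -0.03])), np.array([0.05, 0.02, 0.0])
R_new, T_new = compose_with_inverse_increment(R, T, dR, dT)
print(np.allclose(warp(inverse_increment(X, dR, dT, C), R, T, C), warp(X, R_new, T_new, C)))
```

Because the updated parameters again describe a rotation about C plus a translation, the warp stays within the same motion model after every update.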
  • 29. Experiments with real sequences: 432 x 192 pixels, 600 fps (ball radius: 16 to 17 pixels)
  • 31. Simulation
      ■ Synthetic image sequence
          - Rendered with 3D CG software
          - Orthographic projection
      ■ Background: gray
      ■ Texture: text
      ■ Radius: 35 pixels
      ■ Rotation: 5 degrees about each of the X, Y, and Z axes
      ■ Translation: none
      [Figure: estimated angle [degree] for X rotation, Y rotation, and Z rotation; vertical axis from -6 to 6.]
  • 32. Visual inspection (radius: 61 pixels)
      [Figure panels: I1; I2; warped image; difference between images, I1(x) − I2(w(x;p)); cost function. Note on the figure: "the brighter the closer".]
  • 33. Conclusion
      ■ Proposed an ICA-based ball spin estimation method
          - Orthogonal projection
          - Centers and radius given
          - ICA approach
      ■ Future work
          - Easy-to-use system design
          - Accuracy/error analysis
          - Fast implementation
          - Ball detection/tracking