Maher Nadar 12/06/2016
Computer Vision Final Project
Camera/Pipe Orientation Extraction Using Mathematical Methods and EPnP
Abstract
This paper presents a potential solution that allows a camera (possibly fixed on a drone) to retrieve its relative position with respect to a pipe covered by a known and easily observable pattern. Starting with close-up images of the pipe in question, the first step is to binarize them. A 'Prewitt' detector isolates the edges in the resulting image. Next, the borders of the pipe are localised by means of a 'Hough Transform', thus segregating the region of interest. Converting the region of interest of the original image to HSV and applying a morphological opening ('imopen') to the 'value' component of that colour representation returns the prospective black dots. The detected dots are then filtered and recognised as the dots of our pattern, so that their 2D coordinates are in hand. Knowing the 3D coordinates, the camera/pipe pose is finally obtained with the EPnP algorithm.
INTRODUCTION AND MOTIVATION
In the world of oil and gas, the cost of pipeline maintenance is remarkably high. Scanning the vast stretches of pipe, usually in very harsh environmental conditions, to pinpoint deteriorated areas is a tedious task not willingly taken on by humans.
With the emergence of affordable drones, it is now possible to skim through acres of territory per day with very little effort. Indeed, a drone equipped with the right cameras can nowadays detect moisture, motion behind opaque objects and much more. In that spirit, this study presents a possible way for a drone to know its position with respect to a pipe while hovering close to it, in case it needs to further interact with the pipe it is facing.
Since the intended methodology is to be applied to monocular imaging, analysing a single frame at a time, the methods considered for the presented task were Direct Linear Transformation and Perspective-3-Point.
Direct Linear Transformation (DLT)
In this algorithm, the correspondences between 3D and 2D points are stacked into a $2n \times 12$ matrix $M$, where $n$ is the number of detected correspondences. Note that the 3D points' relation to each other is known beforehand (see Figure 1). The vector $P$ holding the 12 entries of the projection matrix is the eigenvector of $M^{T}M$ associated with the smallest eigenvalue, i.e. the right singular vector of $M$ with the smallest singular value.
Assuming that the matrix $A$ containing the camera's intrinsic parameters is known, and reshaping the vector $P$ into a $3 \times 4$ matrix, the transformation matrix enclosing the rotation and translation is recovered, up to scale, as $[R\,|\,t] \propto A^{-1}P$.
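As an illustration of these two steps, here is a minimal MATLAB sketch (not from the original report; the variable names and the scale normalisation are assumptions) that builds the $2n \times 12$ matrix and recovers $[R\,|\,t]$:

```matlab
% Minimal DLT sketch: Xw is an n-by-3 matrix of world points, uv an
% n-by-2 matrix of pixel points, A the 3x3 intrinsic matrix.
n = size(Xw, 1);
M = zeros(2*n, 12);
for i = 1:n
    X = [Xw(i,:), 1];                     % homogeneous 3D point
    u = uv(i,1);  v = uv(i,2);
    M(2*i-1,:) = [X, zeros(1,4), -u*X];   % row from the u-equation
    M(2*i,  :) = [zeros(1,4), X, -v*X];   % row from the v-equation
end
[~, ~, V] = svd(M);
P  = reshape(V(:,end), 4, 3)';            % 3x4 projection, smallest singular vector
Rt = A \ P;                               % [R|t] up to scale
Rt = Rt / norm(Rt(1,1:3));                % normalise so the rows of R are unit
```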
Perspective-3-Point (P3P)
The simplest form of the PnP methodology is $n = 3$ (i.e. three point correspondences). However, three points alone yield several solutions, so a fourth point is usually added in order to resolve the ambiguity.
Problem formulation
$P$: camera centre of projection; $A$, $B$, $C$: 3D points; $u$, $v$, $w$: their 2D projections
$X = |PA|$, $Y = |PB|$, $Z = |PC|$; $\alpha = \angle BPC$, $\beta = \angle APC$, $\gamma = \angle APB$
$p = 2\cos\alpha$, $q = 2\cos\beta$, $r = 2\cos\gamma$; $a' = |AB|$, $b' = |BC|$, $c' = |AC|$
Figure 1: DLT points correspondences
From the triangles PBC, PAC and PAB, the law of cosines gives the following set of P3P equations:
$$Y^2 + Z^2 - YZp - b'^2 = 0, \qquad X^2 + Z^2 - XZq - c'^2 = 0, \qquad X^2 + Y^2 - XYr - a'^2 = 0.$$
Normalising the image points and solving the above set of equations yields up to four potential solutions for the rotation and translation matrices $R$ and $T$. A fourth point is then introduced in order to select the best solution.
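For illustration only, the system can also be solved numerically; the sketch below (an assumption, not the closed-form quartic solution used in practice) hands the three equations to MATLAB's fsolve (Optimization Toolbox), which returns one of the up-to-four roots per starting guess:

```matlab
% Numeric P3P sketch; assumes p, q, r (from the normalised image points)
% and aP, bP, cP (the known 3D distances a', b', c') are already computed.
F = @(s) [ s(2)^2 + s(3)^2 - s(2)*s(3)*p - bP^2;    % triangle PBC
           s(1)^2 + s(3)^2 - s(1)*s(3)*q - cP^2;    % triangle PAC
           s(1)^2 + s(2)^2 - s(1)*s(2)*r - aP^2 ];  % triangle PAB
XYZ = fsolve(F, [1; 1; 1]);    % [X; Y; Z]: one root per initial guess
```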
Efficient Perspective-n-Point (EPnP)
The Efficient Perspective-n-Point technique is inspired by the plain PnP one, but accommodates more than four correspondences with negligible extra computational cost, if any. The main concept behind this technique is that the coordinates of the 3D points are expressed with respect to four virtual control points (one being the centroid of the points, and the other three forming a basis along the principal directions of the data):
$$\mathbf{p}_i = \sum_{j=1}^{4} \alpha_{ij}\,\mathbf{c}_j, \qquad \sum_{j=1}^{4} \alpha_{ij} = 1,$$
where the $\mathbf{p}_i$ are the actual points and the $\mathbf{c}_j$ the virtual control points.
With the calibration matrix $A$ in hand, the correspondence relation between 3D and 2D coordinates becomes
$$w_i \begin{bmatrix} u_i \\ v_i \\ 1 \end{bmatrix} = A \sum_{j=1}^{4} \alpha_{ij}\,\mathbf{c}_j^{c},$$
where the $w_i$ are scalar projective parameters which, according to the last row, can be expressed as $w_i = \sum_{j} \alpha_{ij}\, z_j^{c}$.
The only unknowns left are the 12 coordinates of the control points in the camera frame. Substituting the $w_i$ into the first two rows gives two linear equations per correspondence, resulting in a system of the form $Mx = 0$, where $M$ is a $2n \times 12$ matrix and $x$ is the $12 \times 1$ vector of unknowns (the control point coordinates).
The solution is simply $x = \sum_{i=1}^{N} \beta_i \mathbf{v}_i$, where the $\mathbf{v}_i$ span the null space of $M$, i.e. they correspond to its $N$ (varying from 1 to 4) null singular values (equivalently, the eigenvectors of $M^{T}M$ with the smallest eigenvalues). Finally, in order to calculate the right weights $\beta_i$, the solutions for all four values of $N$ are computed, and the one with the least reprojection error is retained.
Figure 2: P3P 2-points relation
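To make the construction concrete, here is a heavily condensed MATLAB sketch of the linear part of EPnP (control points, barycentric coordinates and the matrix $M$). The $\beta$-weight estimation and the final absolute-orientation step are omitted, and all variable names are assumptions:

```matlab
% Pw: n-by-3 world points; U: n-by-2 pixel points; A: 3x3 intrinsics.
n  = size(Pw, 1);
c1 = mean(Pw, 1);                                  % control point 1: centroid
[~, ~, Vp] = svd(Pw - c1, 'econ');                 % principal directions
C  = [c1; c1 + Vp(:,1)'; c1 + Vp(:,2)'; c1 + Vp(:,3)'];
Alpha = ([C'; ones(1,4)] \ [Pw'; ones(1,n)])';     % n-by-4 barycentric coords
fu = A(1,1); fv = A(2,2); uc = A(1,3); vc = A(2,3);
M = zeros(2*n, 12);
for i = 1:n
    for j = 1:4
        a = Alpha(i,j);
        M(2*i-1, 3*j-2:3*j) = [a*fu, 0,    a*(uc - U(i,1))];
        M(2*i,   3*j-2:3*j) = [0,    a*fv, a*(vc - U(i,2))];
    end
end
[~, ~, V] = svd(M);
v1 = V(:, end);   % kernel vector for N = 1; scale by beta, then recover R, t
```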
METHODOLOGY
Camera Calibration
The first step is, of course, to calibrate the camera at hand. Using the Camera Calibrator app included in MATLAB, the camera intrinsic parameters are acquired: 20 pictures of the usual calibration checkerboard are taken and fed to the application. Figure 3 shows an example of the automatic pre-processing applied to one of the images before calibration.
At the end of the calibration process, the application displays the 3D re-projection of all the images in the camera frame (Figure 4), a mean reprojection error for every image and, of course, the camera parameters.
Calibration Results
Focal length: [1.5245e+03 1.5249e+03]
Principal point: [614.5443 530.7807]
Thus, the calibration matrix is
$$A = \begin{bmatrix} 1524.5 & 0 & 614.54 \\ 0 & 1524.9 & 530.78 \\ 0 & 0 & 1 \end{bmatrix}.$$
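The same calibration can also be scripted. A minimal sketch using the Computer Vision Toolbox follows; the folder name and checkerboard square size are assumptions, and newer MATLAB releases expose the intrinsics slightly differently:

```matlab
% Scripted checkerboard calibration (folder and square size assumed).
files = dir(fullfile('calib_images', '*.jpg'));
files = fullfile({files.folder}, {files.name});
[imagePoints, boardSize] = detectCheckerboardPoints(files);
squareSize  = 25;                                  % square side in mm
worldPoints = generateCheckerboardPoints(boardSize, squareSize);
params = estimateCameraParameters(imagePoints, worldPoints);
A = params.IntrinsicMatrix';                       % MATLAB stores A transposed
```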
Automatic 2D-3D correspondence acquisition
In this paper, the pose recovery method used is the Efficient Perspective-n-Point (EPnP). As with the techniques reviewed above, correspondences between the 3D coordinates of specific points on the object in question and their 2D projections on the image frame need to be established before the pose retrieval algorithm can be applied.
Figure 3: a) original checkerboard image. b) image after processing
Figure 4: calibration re-projection
Because the pipes are expected to display relatively few and unreliable natural features (especially if their surface is highly reflective), a pattern that aids the feature extraction process needs to be applied to the pipe. Attaching this pattern to the pipe is possible through the use of flexible magnets, but that is not within the scope of this paper.
The chosen pattern (Figure 5) is wrapped around the pipe such that the line formed by points 1, 2 and 3 is parallel to the axis of the pipe. Naturally, the dimension between points 3 and 6 is shortened when the pattern is curled around the round surface of the pipe; in Figure 6, the chord RQ represents the new dimension. Taking the arc length $a$ as the initial flat-pattern dimension of 32.8 mm, the chord length is calculated as
$$RQ = 2r\sin\!\left(\frac{t}{2}\right), \qquad t = \frac{a}{r},$$
where $t$ is the angle subtended by the arc and $r$ is the pipe radius. Thus, knowing the pipe radius (55 mm in this case), the new dimension is obtained as 37.7188 mm.
3D coordinates
For simplicity, the world coordinate system is chosen such that the $Z = 0$ plane is the plane containing all the projected dots (i.e. the plane whose trace in the cross-sectional cut of the pipe is the chord RQ in Figure 6). The 3D coordinates of the six dots are then calculated in this frame, taking point 1 as the origin.
2D coordinates
The main challenge in this paper is to automatically detect the 2D coordinates of the six points and to assign each of them to the correct corresponding point in the world frame. The series of image processing steps that does so is described next.
Figure 5: Proposed Pattern
Figure 6: RQ chord length calculation
A series of 8 pictures of the pipe with the pattern wrapped around it is considered for this study. Although all the pictures generate a successful output, the process description is based on one example picture from the set.
Starting with the original close-up image (Figure 7.a), the first step is binarization. A 'Prewitt' detector then isolates the edges in the resulting image (Figure 7.b). Next, the borders of the pipe are localised by means of a 'Hough Transform' (Figure 7.c).
Following that, the region of interest (the pipe pixels) is obtained by extending the Hough lines to the edges of the image and setting the pixels outside this region to white (Figure 8.a). The ROI is then converted to HSV, and an 'imopen' morphological operation with a disk kernel is applied to the 'V' channel of this colour representation in order to search for prospective black dots (Figure 8.b). As can be observed, many unwanted black dots are detected on the border of the ROI. To filter them, the distance between each dot and the two lines forming the border of the ROI is computed, and the dots whose distance falls below a threshold are eliminated. Finally, the six target dots are isolated (Figure 8.c).
Figure 7: a) original image b) Binarization + edge detection c) Line detection (Hough Transform)
Figure 8: a) ROI b) black dots detection c) Black dots after filtering
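A condensed MATLAB sketch of this detection chain follows; the thresholds, the disk radius and the ROI-masking details are assumptions, and the report's actual code may differ:

```matlab
I  = imread('pipe_closeup.jpg');              % hypothetical input image
BW = imbinarize(rgb2gray(I));                 % binarization
E  = edge(BW, 'prewitt');                     % Prewitt edge map
[H, theta, rho] = hough(E);                   % Hough transform
peaks = houghpeaks(H, 2);                     % two strongest lines = borders
brdr  = houghlines(E, theta, rho, peaks);
% ...extend the two lines to the image edges, build a polygonal ROI mask
% (e.g. with roipoly) and set the pixels outside it to white...
hsv  = rgb2hsv(I);
Vo   = imopen(hsv(:,:,3), strel('disk', 15)); % opening on the 'V' channel
dots = Vo < 0.3;                              % dark blobs = candidate dots
cand = regionprops(dots, 'Centroid');         % 2D centres of the candidates
```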
Up to this point, the six dots of the pattern are isolated in the image. The next step is to assign each of them to its corresponding 3D match. To do so, the proposed idea is to generate a new reference frame within the image that helps distinguish the dots. A convenient frame is the one formed by the bisector of the pipe together with the line orthogonal to it and, for further convenience, its origin is chosen at the edge of the image. In Figure 9.a, the normal image reference frame is represented as {x, y} in black, whereas the newly chosen frame is represented as {x', y'} in red.
The tables in Figure 10 show the steps taken to assign the six detected dots to their 3D matches. The leftmost table lists the dot coordinates in the image frame; as mentioned before, they are still unidentifiable at this stage. In the second table, the same dots (in the same order) have gone through a coordinate transformation into {x', y'}. By comparing these values with the order of the dots in the pattern of Figure 5, each dot is assigned the right number. Last but not least, the coordinates of the numbered dots are transformed back to the image frame.
Figure 9: a) New reference frame b) Final dots 2D-3D matching
Figure 10: 2D coordinates correspondences
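A short sketch of this re-numbering step (all names are assumptions; the angle `phi` would come from the orientation of the detected Hough lines, and the row/column sort is one simple way to reproduce the ordering of Figure 10):

```matlab
% dotsXY: 6-by-2 dot centres in the image frame {x,y};
% phi: angle of the pipe bisector; o: chosen origin on the image edge.
Rz    = [cos(phi), sin(phi); -sin(phi), cos(phi)];
dotsP = (Rz * (dotsXY - o)')';               % coordinates in the {x',y'} frame
[~, ord] = sortrows(round(dotsP), [2, 1]);   % split the two rows of dots,
                                             % then order along the pipe axis
dotsNumbered = dotsXY(ord, :);               % rows now follow pattern dots 1..6
```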
Pose retrieval through the EPnP algorithm
Now that the 2D coordinates, the 3D coordinates and the camera calibration matrix are in hand, the EPnP function can be used to obtain the rotation matrix, the translation vector and the position of the dots in the camera reference frame.
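In code the retrieval reduces to a few lines. The sketch below assumes the authors' reference EPnP MATLAB implementation (`efficient_pnp`); its homogeneous input convention should be checked against the version in use:

```matlab
% X3d: 6-by-3 pattern coordinates; x2d: 6-by-2 matched pixel coordinates.
x3d_h = [X3d, ones(6,1)];                     % homogeneous world points
x2d_h = [x2d, ones(6,1)];                     % homogeneous image points
[R, T, Xc] = efficient_pnp(x3d_h, x2d_h, A);  % rotation, translation and
                                              % dot positions in the camera frame
```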
RESULTS AND DISCUSSION
The obtained dot coordinates in the camera frame appear to be in accordance with the real distance between the camera and the pipe when the picture was taken.
The results of two other example pictures are displayed herein to further demonstrate the robustness of the code:
CONCLUSION
In this paper, a technique to estimate the camera-to-pipe pose has been proposed and successfully applied to a set of 8 close-up images of the pipe in question, along with a chosen pattern used to improve the feature extraction. After a sequence of image processing steps, the six dots present in the pattern are isolated and each assigned a number corresponding to its match in 3D. Having the 3D coordinates (known from the pattern dimensions) and the camera intrinsic matrix, the pose of the points with respect to the camera frame was calculated using the EPnP function in MATLAB.
Limitations and Future Work
Although the algorithm is fairly robust in its correspondence matching and pose estimation from a given set of images, it is not able to determine whether the pattern in the image is seen upright or flipped. Indeed, the output of this code is one of the two possible poses that can be obtained. A possible solution to this issue is to choose a pattern with an extra dot located on the upper or lower side of the pattern.
Another limitation of this algorithm is that the code cannot calculate the camera-to-pipe pose unless the dots are detected, which implies that the camera should be relatively close to the pipe in order to obtain the required results. This could also be solved with a smart choice of pattern. For instance, the pattern could contain extra dots of a different colour (e.g. red), so as not to hinder the black-dot detection, and of a bigger size, allowing detection from a greater distance between camera and pipe. The red dots would then be used at relatively far positions and, when the camera gets closer, the search for the smaller black dots would begin.
END
REFERENCES
V. Lepetit, F. Moreno-Noguer and P. Fua. "EPnP: An Accurate O(n) Solution to the PnP Problem." International Journal of Computer Vision, vol. 81, pp. 155-166, 2009.
J. Jensen. "Hough Transform for Straight Lines" (PDF). Retrieved 16 December 2011.
X.-S. Gao, X.-R. Hou, J. Tang and H.-F. Cheng. "Complete Solution Classification for the Perspective-Three-Point Problem." IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 25, no. 8, pp. 930-943, 2003.
Penny (2009). "Question from Wayne." Math Central, University of Regina. http://mathcentral.uregina.ca/QQ/database/QQ.09.09/h/wayne1.html