Real-time Finger Detection
Chin Huan Tan
February 2019
1 Introduction
Finger detection is an interesting topic to explore in image processing, especially
when it is applied to human-computer interaction. In this article, I'm going to
explain how to detect the number of fingers raised in video captured by a
laptop camera.
2 Overview
The first task is to detect the hand in the video frame; this is the most
challenging part. The proposed way is to use Background Subtraction and HSV
Segmentation together to create a mask. After the hand is segmented, we
detect the number of fingers raised, using one of two proposed methods. The first
is to find the largest contour in the image, which is assumed to be the hand, and
then compute its convex hull and convexity defects, which most probably
correspond to the spaces between fingers. This is a manual way of counting the fingers.
The second is to feed the mask into a convolutional neural network
that predicts the number of fingers. Here is the link to the source code.
3 Hand detection
The most challenging part is to detect the hand in an image. Many approaches
have been published, for example Background Subtraction by lzane[1], HSV
Segmentation by Amar Prakash Pandey[2], and detection using a Haar Cascade or a
neural network. However, this article only covers background subtraction and
HSV segmentation.
3.1 Background Subtraction
For background subtraction to work, we first need a background image
(without the hand). To find the hand, we can subtract the image containing the
hand from the background. With OpenCV, this is quite easy to implement.
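The idea can be illustrated with a naive per-pixel difference. This is only a sketch with made-up image values and threshold; the MOG2 subtractor used in the article models each pixel statistically instead of comparing against a single stored background image.

```python
# Naive background subtraction: a sketch of the idea only.
# MOG2 is a more robust, adaptive version of this comparison.

def naive_bg_subtract(background, frame, threshold=30):
    """Return a binary mask: 1 where the frame differs from the background."""
    return [[1 if abs(f - b) > threshold else 0
             for f, b in zip(frow, brow)]
            for frow, brow in zip(frame, background)]

# Tiny grayscale example: the "hand" raises pixel values in one corner.
background = [[10, 10, 10],
              [10, 10, 10]]
frame      = [[10, 10, 200],
              [10, 10, 190]]
mask = naive_bg_subtract(background, frame)
# mask -> [[0, 0, 1], [0, 0, 1]]
```

A fixed threshold like this breaks down as soon as lighting changes, which is why an adaptive model such as MOG2 is preferred.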
Note that the code here is partial, for explanation only. For the fully
functional program, go to the source code here. First, we create a background
subtractor while the background is clear (no hand in view).
"""
@arg history=10: The length of the history
@arg varThreshold=30: The threshold to decide whether a pixel is well described by the background model
@arg detectShadows=False: The algorithm will ignore shadows. If True, the algorithm will detect shadows and mark them in gray
"""
bgSubtractor = cv2.createBackgroundSubtractorMOG2(history=10, varThreshold=30, detectShadows=False)
After the background subtractor is created, we can apply the background
subtraction to every video frame to create a mask.
def bgSubMasking(self, frame):
    """Create a foreground (hand) mask
    @param frame: The video frame
    @return: A masked frame
    """
    fgmask = bgSubtractor.apply(frame, learningRate=0)

    # MORPH_OPEN consists of erosion of the objects followed by dilation.
    # The effect is to remove the noise in the background.
    # MORPH_CLOSE consists of dilation of the objects followed by erosion.
    # The effect is to close the holes in the objects.
    kernel = np.ones((4, 4), np.uint8)
    fgmask = cv2.morphologyEx(fgmask, cv2.MORPH_OPEN, kernel, iterations=2)
    fgmask = cv2.morphologyEx(fgmask, cv2.MORPH_CLOSE, kernel, iterations=2)

    # Apply the mask on the frame and return
    return cv2.bitwise_and(frame, frame, mask=fgmask)
Here is the result after the background subtraction is applied.
Note that the background is masked away.
However, there is a major problem here. The background subtraction alone
will capture other moving objects in the video frames. Hence, we introduce
another method.
3.2 HSV Segmentation
In HSV (Hue, Saturation, Value) segmentation, the idea is to segment the hand
based on its color. First, we sample the color of the hand; then we use the
sample to detect it. Usually, a pixel in a frame or an image is represented as RGB (Red,
Green, Blue). The reason we use HSV rather than RGB is that RGB mixes
the brightness of the color into all three channels. Therefore, when we sample the
color of the hand in RGB, we sample the brightness as well, and the hand would
later have to be under the same brightness in order to be detected. In HSV, the
brightness of a color is isolated in the Value (V) channel. Hence, when we sample
the color of the hand, we sample only the Hue (H) and Saturation (S).
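Python's standard-library colorsys module shows why this works: two shades of the same color share H and S and differ only in V. Note that colorsys uses 0-1 ranges for all channels, whereas OpenCV's 8-bit HSV uses 0-179 for H and 0-255 for S and V; the RGB values here are arbitrary.

```python
import colorsys

# Two versions of the same color: the second is half as bright.
bright = (0.8, 0.6, 0.5)          # RGB in [0, 1]
dark   = (0.4, 0.3, 0.25)

h1, s1, v1 = colorsys.rgb_to_hsv(*bright)
h2, s2, v2 = colorsys.rgb_to_hsv(*dark)

# Hue and saturation match; only Value (brightness) differs.
print(round(h1, 3) == round(h2, 3))   # True
print(round(s1, 3) == round(s2, 3))   # True
print(v1, v2)                          # 0.8 0.4
```

So a histogram built over H and S alone recognizes the hand color regardless of how brightly it is lit.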
Based on the technique by Amar[2], we place our hand at a fixed location to
take some samples of the hand color. From these pixels, we form a histogram
of how frequently each color appears in the sample. This forms a
probability distribution over the colors: by normalizing the histogram, we can
find the probability of each color being part of the hand.
"""
@arg [roi]: The region of interest, which is the 9 sampling squares
@arg [0, 1]: Take only the Hue and Saturation channels, ignoring the third channel, Value
@arg [0, 180, 0, 256]: The range of Hue values is 0-179 whereas the range of Saturation values is 0-255
"""
handHist = cv2.calcHist([roi], [0, 1], None, [180, 256], [0, 180, 0, 256])
handHist = cv2.normalize(handHist, handHist, 0, 255, cv2.NORM_MINMAX)
Having created the normalized histogram of hand colors, we can now create
the HSV mask. The mask is actually a probability map: each
pixel holds the probability of that pixel being part of the hand.
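The back-projection lookup behind this can be sketched in pure Python: each pixel's (H, S) pair indexes the normalized histogram, and the stored value becomes that pixel's probability. The bins and values below are made up; cv2.calcBackProject performs the same lookup over the real 180 x 256 bins.

```python
# Toy back projection: look up each pixel's (hue, sat) bin in a
# normalized histogram. The bin layout and values here are hypothetical.

hand_hist = {          # probability (0-255) that an (H, S) bin is "hand"
    (12, 150): 255,    # a skin-tone bin sampled from the hand
    (12, 140): 200,
    (90, 50):  0,      # a background bin
}

def back_project(hs_pixels, hist):
    """Map each (H, S) pixel to its histogram probability (0 if unseen)."""
    return [hist.get(px, 0) for px in hs_pixels]

row = [(12, 150), (90, 50), (12, 140)]
print(back_project(row, hand_hist))   # [255, 0, 200]
```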
def histMasking(frame, handHist):
    """Create the HSV masking
    @param frame: The video frame
    @param handHist: The histogram generated
    @return: A masked frame
    """
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    dst = cv2.calcBackProject([hsv], [0, 1], handHist, [0, 180, 0, 256], 1)

    disc = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (21, 21))
    cv2.filter2D(dst, -1, disc, dst)
    # dst is now a probability map

    # Use binary thresholding to create a map of 0s and 1s
    # 1 means the pixel is part of the hand and 0 means it is not
    ret, thresh = cv2.threshold(dst, 150, 255, cv2.THRESH_BINARY)

    kernel = np.ones((5, 5), np.uint8)
    thresh = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel, iterations=7)

    thresh = cv2.merge((thresh, thresh, thresh))
    return cv2.bitwise_and(frame, thresh)
Below is the result after the HSV segmentation.
The downside of this segmentation is that any skin-colored region will be
detected, but we want only the hand. Therefore, we apply a "bitwise and"
operation to the foreground mask and the HSV mask; the result is our final mask.
histMask = histMasking(roi, handHist)
bgSubMask = bgSubMasking(roi)
mask = cv2.bitwise_and(histMask, bgSubMask)
4 Finger Counting
Once we have the mask, we can count the number of fingers. We
have two methods: one is to do it manually by finding convexity defects; the
other uses a convolutional neural network.
4.1 Manual Method
Green: Contour
Red: Convex hull
Blue: Convexity defect
After the hand segmentation, the mask should contain only the hand. Therefore,
in the manual method, we start by finding the largest contour, which
is assumed to be the hand.
def threshold(mask):
    """Threshold into a binary mask"""
    grayMask = cv2.cvtColor(mask, cv2.COLOR_BGR2GRAY)
    ret, thresh = cv2.threshold(grayMask, 0, 255, 0)
    return thresh

def getMaxContours(contours):
    """Find the largest contour"""
    maxIndex = 0
    maxArea = 0
    for i in range(len(contours)):
        cnt = contours[i]
        area = cv2.contourArea(cnt)
        if area > maxArea:
            maxArea = area
            maxIndex = i
    return contours[maxIndex]

thresh = threshold(mask)
_, contours, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

# There might be no contour when the hand is not inside the frame
if len(contours) > 0:
    maxContour = getMaxContours(contours)
After finding the largest contour, we find its convex hull. The convex hull
is simply a curve enclosing the contour. From the convex hull, we can find the
convexity defects: the places where the contour bulges inward. These are
assumed to be the spaces between the fingers, so we use them to determine
the number of fingers.
def countFingers(contour):
    hull = cv2.convexHull(contour, returnPoints=False)
    if len(hull) > 3:
        defects = cv2.convexityDefects(contour, hull)
        cnt = 0
        if type(defects) != type(None):
            for i in range(defects.shape[0]):
                s, e, f, d = defects[i, 0]
                start = tuple(contour[s, 0])
                end = tuple(contour[e, 0])
                far = tuple(contour[f, 0])
                angle = calculateAngle(far, start, end)

                # Ignore the defects which are small and wide
                # Probably not fingers
                if d > 10000 and angle <= math.pi / 2:
                    cnt += 1
        return True, cnt
    return False, 0

def calculateAngle(far, start, end):
    """Cosine rule"""
    a = math.sqrt((end[0] - start[0])**2 + (end[1] - start[1])**2)
    b = math.sqrt((far[0] - start[0])**2 + (far[1] - start[1])**2)
    c = math.sqrt((end[0] - far[0])**2 + (end[1] - far[1])**2)
    angle = math.acos((b**2 + c**2 - a**2) / (2 * b * c))
    return angle
When counting the convexity defects, we have to impose some limitations,
because we do not want to count every defect, especially when the contour is
distorted. First, the depth of a defect has to be larger than a certain value
(10000 in the example above); this excludes small defects that are probably not
fingers. Second, we exclude defects wider than 90 degrees, computing the angle
with the cosine rule.
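The angle test can be checked on two hypothetical defects (coordinates made up). The helper below mirrors the calculateAngle function from the article, using math.dist for the side lengths.

```python
import math

def calculate_angle(far, start, end):
    """Cosine rule: angle at `far` between the hull points `start` and `end`."""
    a = math.dist(start, end)   # side opposite the angle at `far`
    b = math.dist(far, start)
    c = math.dist(far, end)
    return math.acos((b**2 + c**2 - a**2) / (2 * b * c))

# Narrow, deep defect (finger-like): the defect point lies far below
# two nearby hull points -> small angle at the defect point.
narrow = calculate_angle(far=(0, 100), start=(-10, 0), end=(10, 0))
# Wide, shallow defect (wrist or noise): angle well above 90 degrees.
wide = calculate_angle(far=(0, 5), start=(-50, 0), end=(50, 0))

print(narrow < math.pi / 2 < wide)   # True
```

Only the narrow defect passes the 90-degree filter, which is exactly the behavior we want between fingers.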
The output is the number of convexity defects. For example, if the
number of convexity defects is two, then the number of fingers raised is three.
However, this alone cannot differentiate between no finger raised and
one finger raised. That case can be resolved by computing the distance between the
centroid of the contour and the highest point of the contour: if it exceeds
a certain distance, one finger is raised; otherwise none.
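This centroid test can be sketched in pure Python with made-up contour points and a hypothetical distance threshold; with OpenCV, one would get the centroid from cv2.moments and the highest point from the contour's minimum y-coordinate.

```python
def centroid(points):
    """Mean of the contour points (cv2.moments gives the same idea)."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (sum(xs) / len(xs), sum(ys) / len(ys))

def fingers_zero_or_one(points, min_dist=80):
    """0 or 1 finger: is the topmost point far enough above the centroid?"""
    cx, cy = centroid(points)
    top_y = min(p[1] for p in points)        # image y grows downward
    return 1 if (cy - top_y) > min_dist else 0

# A fist-like blob: the topmost point stays close to the centroid.
fist = [(0, 100), (100, 100), (100, 200), (0, 200)]
# One raised finger: a point extends far above the rest of the blob.
finger = fist + [(50, 0)]

print(fingers_zero_or_one(fist), fingers_zero_or_one(finger))   # 0 1
```

The threshold would need tuning to the hand's size in the frame, since the distance is measured in pixels.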
4.2 Convolutional Neural Network
Using a Convolutional Neural Network (CNN) actually simplifies a lot of the work,
and Keras in Python makes it relatively simple. Due to limited GPU
memory, I resized the video frame from 260 × 260 to 28 × 28. Feel free to
try giving the original size as the input to the CNN. Below is how I construct
my CNN model.
model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D((2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(6, activation='softmax'))
I trained the model on approximately 1000 images per class, with 200 images
for testing, applying some rotation, shifting and flipping to the training
images. For more details, check out the source code. I balanced the number of
training images per class to prevent bias in the model.
Here are the samples of my training data.
The result looks pretty good: the model achieves a validation accuracy of 99% at the
fifth epoch. However, it was trained and tested on my own hand only, so it might
not generalize to other people's hands. Therefore, I'm not posting my model.
Finally, we can load the model and predict the result.
from keras.models import load_model

model = load_model("model_1.h5")

modelInput = cv2.resize(thresh, (28, 28))
modelInput = np.expand_dims(modelInput, axis=-1)
modelInput = np.expand_dims(modelInput, axis=0)

pred = model.predict(modelInput)
pred = np.argmax(pred[0])
5 Conclusion
The detection result can be used as a command to interact with
the computer (I have done key pressing in the source code). Of course, you
can do more than that. However, many improvements are still needed to
make this application practical. Feel free to improve it.
References
[1] lzane/Fingers-Detection-using-OpenCV-and-Python. Retrieved from
https://github.com/lzane/Fingers-Detection-using-OpenCV-and-Python
[2] amarlearning/Finger-Detection-and-Tracking. Retrieved from
https://github.com/amarlearning/Finger-Detection-and-Tracking