Real-time Finger Detection
Chin Huan Tan
February 2019
1 Introduction
Finger detection is an interesting topic to explore in image processing, especially
when it is applied to human-computer interaction. In this article, I'm going to
explain how to detect the number of fingers raised in video captured by a
laptop camera.
2 Overview
The first task is to detect the hand in the video frame; this is the most
challenging part. The proposed way is to use Background Subtraction and HSV
Segmentation together to create a mask. After the hand is segmented, we
detect the number of fingers raised, using one of two proposed methods. The first
is to find the largest contour in the image, which is assumed to be the hand, and
then compute its convex hull and convexity defects, which most probably
correspond to the spaces between fingers. This is a manual way of counting the fingers.
The second is to feed the mask into a convolutional neural network
that predicts the number of fingers. Here is the link to the source code.
3 Hand detection
The most challenging part is to detect the hand in an image. Many approaches
have been published, for example Background Subtraction by lzane[1], HSV
Segmentation by Amar Prakash Pandey[2], and detection using a Haar Cascade or a
neural network. However, this article only covers background subtraction and
HSV segmentation.
3.1 Background Subtraction
For background subtraction to work, we first need a background image
(without the hand). To find the hand, we can subtract the image containing the
hand from the background. With OpenCV, this is quite easy to implement.
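The idea can be illustrated with a naive per-pixel difference. This is only a sketch with made-up image values and threshold; the MOG2 subtractor used in the article models each pixel statistically instead of comparing against a single stored background image.

```python
# Naive background subtraction: a sketch of the idea only.
# MOG2 is a more robust, adaptive version of this comparison.

def naive_bg_subtract(background, frame, threshold=30):
    """Return a binary mask: 1 where the frame differs from the background."""
    return [[1 if abs(f - b) > threshold else 0
             for f, b in zip(frow, brow)]
            for frow, brow in zip(frame, background)]

# Tiny grayscale example: the "hand" raises pixel values in one corner.
background = [[10, 10, 10],
              [10, 10, 10]]
frame      = [[10, 10, 200],
              [10, 10, 190]]
mask = naive_bg_subtract(background, frame)
# mask -> [[0, 0, 1], [0, 0, 1]]
```

A fixed threshold like this breaks down as soon as lighting changes, which is why an adaptive model such as MOG2 is preferred.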
Note that the code here is partial, for explanation only. For the fully
functional program, go to the source code here. First, we create a background
subtractor while the background is clear (no hand in view).
"""
@arg history=10: The length of the history
@arg varThreshold=30: The threshold to decide whether a pixel is well described by the background model
@arg detectShadows=False: The algorithm will ignore shadows. If True, the algorithm will detect shadows and mark them in gray
"""
bgSubtractor = cv2.createBackgroundSubtractorMOG2(history=10, varThreshold=30, detectShadows=False)
After the background subtractor is created, we can apply the background
subtraction to every video frame to create a mask.
def bgSubMasking(self, frame):
    """Create a foreground (hand) mask
    @param frame: The video frame
    @return: A masked frame
    """
    fgmask = bgSubtractor.apply(frame, learningRate=0)

    # MORPH_OPEN consists of erosion of the objects followed by dilation.
    # The effect is to remove the noise in the background.
    # MORPH_CLOSE consists of dilation of the objects followed by erosion.
    # The effect is to close the holes in the objects.
    kernel = np.ones((4, 4), np.uint8)
    fgmask = cv2.morphologyEx(fgmask, cv2.MORPH_OPEN, kernel, iterations=2)
    fgmask = cv2.morphologyEx(fgmask, cv2.MORPH_CLOSE, kernel, iterations=2)

    # Apply the mask on the frame and return
    return cv2.bitwise_and(frame, frame, mask=fgmask)
Here is the result after the background subtraction is applied.
Note that the background is masked away.
However, there is a major problem here. The background subtraction alone
will capture other moving objects in the video frames. Hence, we introduce
another method.
3.2 HSV Segmentation
In HSV (Hue, Saturation, Value) segmentation, the idea is to segment the hand
based on its color. First, we sample the color of the hand; then we use the
sample to detect it. Usually, a pixel in a frame or an image is represented as RGB (Red,
Green, Blue). The reason we use HSV rather than RGB is that RGB mixes
the brightness of the color into all three channels. Therefore, when we sample the
color of the hand in RGB, we sample the brightness as well, and the hand would
later have to be under the same brightness in order to be detected. In HSV, the
brightness of a color is isolated in the Value (V) channel. Hence, when we sample
the color of the hand, we sample only the Hue (H) and Saturation (S).
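Python's standard-library colorsys module shows why this works: two shades of the same color share H and S and differ only in V. Note that colorsys uses 0-1 ranges for all channels, whereas OpenCV's 8-bit HSV uses 0-179 for H and 0-255 for S and V; the RGB values here are arbitrary.

```python
import colorsys

# Two versions of the same color: the second is half as bright.
bright = (0.8, 0.6, 0.5)          # RGB in [0, 1]
dark   = (0.4, 0.3, 0.25)

h1, s1, v1 = colorsys.rgb_to_hsv(*bright)
h2, s2, v2 = colorsys.rgb_to_hsv(*dark)

# Hue and saturation match; only Value (brightness) differs.
print(round(h1, 3) == round(h2, 3))   # True
print(round(s1, 3) == round(s2, 3))   # True
print(v1, v2)                          # 0.8 0.4
```

So a histogram built over H and S alone recognizes the hand color regardless of how brightly it is lit.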
Based on the technique by Amar[2], we place our hand at a fixed location to
take some samples of the hand color. From these pixels, we form a histogram
of how frequently each color appears in the sample. This forms a
probability distribution over the colors: by normalizing the histogram, we can
find the probability of each color being part of the hand.
"""
@arg [roi]: The region of interest, which is the 9 sampling squares
@arg [0, 1]: Take only the Hue and Saturation channels, ignoring the third channel, Value
@arg [0, 180, 0, 256]: The range of Hue values is 0-179 whereas the range of Saturation values is 0-255
"""
handHist = cv2.calcHist([roi], [0, 1], None, [180, 256], [0, 180, 0, 256])
handHist = cv2.normalize(handHist, handHist, 0, 255, cv2.NORM_MINMAX)
Having created the normalized histogram of hand colors, we can now create
the HSV mask. The mask is actually a probability map: each
pixel holds the probability of that pixel being part of the hand.
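The back-projection lookup behind this can be sketched in pure Python: each pixel's (H, S) pair indexes the normalized histogram, and the stored value becomes that pixel's probability. The bins and values below are made up; cv2.calcBackProject performs the same lookup over the real 180 x 256 bins.

```python
# Toy back projection: look up each pixel's (hue, sat) bin in a
# normalized histogram. The bin layout and values here are hypothetical.

hand_hist = {          # probability (0-255) that an (H, S) bin is "hand"
    (12, 150): 255,    # a skin-tone bin sampled from the hand
    (12, 140): 200,
    (90, 50):  0,      # a background bin
}

def back_project(hs_pixels, hist):
    """Map each (H, S) pixel to its histogram probability (0 if unseen)."""
    return [hist.get(px, 0) for px in hs_pixels]

row = [(12, 150), (90, 50), (12, 140)]
print(back_project(row, hand_hist))   # [255, 0, 200]
```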
def histMasking(frame, handHist):
    """Create the HSV masking
    @param frame: The video frame
    @param handHist: The histogram generated
    @return: A masked frame
    """
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    dst = cv2.calcBackProject([hsv], [0, 1], handHist, [0, 180, 0, 256], 1)

    disc = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (21, 21))
    cv2.filter2D(dst, -1, disc, dst)
    # dst is now a probability map

    # Use binary thresholding to create a map of 0s and 1s
    # 1 means the pixel is part of the hand and 0 means it is not
    ret, thresh = cv2.threshold(dst, 150, 255, cv2.THRESH_BINARY)

    kernel = np.ones((5, 5), np.uint8)
    thresh = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel, iterations=7)

    thresh = cv2.merge((thresh, thresh, thresh))
    return cv2.bitwise_and(frame, thresh)
Below is the result after the HSV segmentation.
The downside of this segmentation is that any skin-colored region will be
detected, but we want only the hand. Therefore, we apply a "bitwise and"
operation to the foreground mask and the HSV mask; the result is our final mask.
histMask = histMasking(roi, handHist)
bgSubMask = bgSubMasking(roi)
mask = cv2.bitwise_and(histMask, bgSubMask)
4 Finger Counting
Once we have the mask, we can count the number of fingers. We
have two methods: one is to do it manually by finding convexity defects; the
other uses a convolutional neural network.
4.1 Manual Method
Green: Contour
Red: Convex hull
Blue: Convexity defect
After the hand segmentation, the mask should contain only the hand. Therefore,
in the manual method, we start by finding the largest contour, which
is assumed to be the hand.
def threshold(mask):
    """Threshold into a binary mask"""
    grayMask = cv2.cvtColor(mask, cv2.COLOR_BGR2GRAY)
    ret, thresh = cv2.threshold(grayMask, 0, 255, 0)
    return thresh

def getMaxContours(contours):
    """Find the largest contour"""
    maxIndex = 0
    maxArea = 0
    for i in range(len(contours)):
        cnt = contours[i]
        area = cv2.contourArea(cnt)
        if area > maxArea:
            maxArea = area
            maxIndex = i
    return contours[maxIndex]

thresh = threshold(mask)
_, contours, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

# There might be no contour when the hand is not inside the frame
if len(contours) > 0:
    maxContour = getMaxContours(contours)
After finding the largest contour, we find its convex hull. The convex hull
is simply a curve enclosing the contour. From the convex hull, we can find the
convexity defects: the places where the contour bulges inward. These are
assumed to be the spaces between the fingers, so we use them to determine
the number of fingers.
def countFingers(contour):
    hull = cv2.convexHull(contour, returnPoints=False)
    if len(hull) > 3:
        defects = cv2.convexityDefects(contour, hull)
        cnt = 0
        if type(defects) != type(None):
            for i in range(defects.shape[0]):
                s, e, f, d = defects[i, 0]
                start = tuple(contour[s, 0])
                end = tuple(contour[e, 0])
                far = tuple(contour[f, 0])
                angle = calculateAngle(far, start, end)

                # Ignore the defects which are small and wide
                # Probably not fingers
                if d > 10000 and angle <= math.pi / 2:
                    cnt += 1
        return True, cnt
    return False, 0

def calculateAngle(far, start, end):
    """Cosine rule"""
    a = math.sqrt((end[0] - start[0])**2 + (end[1] - start[1])**2)
    b = math.sqrt((far[0] - start[0])**2 + (far[1] - start[1])**2)
    c = math.sqrt((end[0] - far[0])**2 + (end[1] - far[1])**2)
    angle = math.acos((b**2 + c**2 - a**2) / (2 * b * c))
    return angle
When counting the convexity defects, we have to impose some limitations,
because we do not want to count every defect, especially when the contour is
distorted. First, the depth of a defect has to be larger than a certain value
(10000 in the example above); this excludes small defects that are probably not
fingers. Second, we exclude defects wider than 90 degrees, computing the angle
with the cosine rule.
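The angle test can be checked on two hypothetical defects (coordinates made up). The helper below mirrors the calculateAngle function from the article, using math.dist for the side lengths.

```python
import math

def calculate_angle(far, start, end):
    """Cosine rule: angle at `far` between the hull points `start` and `end`."""
    a = math.dist(start, end)   # side opposite the angle at `far`
    b = math.dist(far, start)
    c = math.dist(far, end)
    return math.acos((b**2 + c**2 - a**2) / (2 * b * c))

# Narrow, deep defect (finger-like): the defect point lies far below
# two nearby hull points -> small angle at the defect point.
narrow = calculate_angle(far=(0, 100), start=(-10, 0), end=(10, 0))
# Wide, shallow defect (wrist or noise): angle well above 90 degrees.
wide = calculate_angle(far=(0, 5), start=(-50, 0), end=(50, 0))

print(narrow < math.pi / 2 < wide)   # True
```

Only the narrow defect passes the 90-degree filter, which is exactly the behavior we want between fingers.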
The output is the number of convexity defects. For example, if the
number of convexity defects is two, then the number of fingers raised is three.
However, this alone cannot differentiate between no finger raised and
one finger raised. That case can be resolved by computing the distance between the
centroid of the contour and the highest point of the contour: if it exceeds
a certain distance, one finger is raised; otherwise none.
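This centroid test can be sketched in pure Python with made-up contour points and a hypothetical distance threshold; with OpenCV, one would get the centroid from cv2.moments and the highest point from the contour's minimum y-coordinate.

```python
def centroid(points):
    """Mean of the contour points (cv2.moments gives the same idea)."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (sum(xs) / len(xs), sum(ys) / len(ys))

def fingers_zero_or_one(points, min_dist=80):
    """0 or 1 finger: is the topmost point far enough above the centroid?"""
    cx, cy = centroid(points)
    top_y = min(p[1] for p in points)        # image y grows downward
    return 1 if (cy - top_y) > min_dist else 0

# A fist-like blob: the topmost point stays close to the centroid.
fist = [(0, 100), (100, 100), (100, 200), (0, 200)]
# One raised finger: a point extends far above the rest of the blob.
finger = fist + [(50, 0)]

print(fingers_zero_or_one(fist), fingers_zero_or_one(finger))   # 0 1
```

The threshold would need tuning to the hand's size in the frame, since the distance is measured in pixels.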
4.2 Convolutional Neural Network
Using a Convolutional Neural Network (CNN) actually simplifies a lot of the work,
and Keras in Python makes it relatively simple. Due to limited GPU
memory, I resized the video frame from 260 × 260 to 28 × 28. Feel free to
try giving the original size as the input to the CNN. Below is how I construct
my CNN model.
model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D((2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(6, activation='softmax'))
I trained the model on approximately 1000 images per class, with 200 images
for testing, applying some rotation, shifting and flipping to the training
images. For more details, check out the source code. I balanced the number of
training images per class to prevent bias in the model.
Here are the samples of my training data.
The result looks pretty good: the model achieves a validation accuracy of 99% at the
fifth epoch. However, it was trained and tested on my own hand only, so it might
not generalize to other people's hands. Therefore, I'm not posting my model.
Finally, we can load the model and predict the result.
from keras.models import load_model

model = load_model("model_1.h5")

modelInput = cv2.resize(thresh, (28, 28))
modelInput = np.expand_dims(modelInput, axis=-1)
modelInput = np.expand_dims(modelInput, axis=0)

pred = model.predict(modelInput)
pred = np.argmax(pred[0])
5 Conclusion
The detection result can be used as a command to interact with
the computer (I have done key pressing in the source code). Of course, you
can do more than that. However, many improvements are still needed to
make this application practical. Feel free to improve it.
References
[1] lzane/Fingers-Detection-using-OpenCV-and-Python. Retrieved from
https://github.com/lzane/Fingers-Detection-using-OpenCV-and-Python
[2] amarlearning/Finger-Detection-and-Tracking. Retrieved from
https://github.com/amarlearning/Finger-Detection-and-Tracking