This document discusses calculating the derivative of an SVM loss function. It begins by defining the SVM loss function, which aims to make the score of the correct class exceed the scores of the incorrect classes by at least a fixed margin. It then shows how to compute the per-class scores for a single data point as dot products between the data point and each class's weight vector. The loss is written out for a sample where class 2 is the correct class. Differentiating this loss with respect to each weight vector gives the gradient used in optimization; the derivative takes one form for the correct class's weights and another for the incorrect classes' weights. This single-example approach generalizes directly to multiple examples by summing or averaging over them.
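The steps summarized above can be sketched in code. The following is a minimal NumPy illustration (not the document's own implementation): for weight rows w_j and a data point x, the scores are s_j = w_j · x, the multiclass hinge loss for correct class y is L = Σ_{j≠y} max(0, s_j − s_y + Δ), and differentiating gives dL/dw_j = x for each incorrect class j whose margin is positive, while dL/dw_y = −(number of positive margins) · x for the correct class. The function name, argument shapes, and margin Δ = 1 are assumptions for the sketch.

```python
import numpy as np

def svm_loss_and_grad(W, x, y, delta=1.0):
    """Multiclass SVM (hinge) loss and its gradient for one example.

    W: (C, D) weight matrix, one row of weights per class (assumed layout)
    x: (D,) single data point
    y: index of the correct class
    """
    scores = W @ x                           # s_j = w_j . x for each class j
    margins = scores - scores[y] + delta     # s_j - s_y + delta
    margins[y] = 0.0                         # correct class contributes no loss
    loss = np.sum(np.maximum(0.0, margins))

    # Gradient, split by case as in the text:
    #   incorrect class j with positive margin: dL/dw_j = x
    #   correct class y: dL/dw_y = -(count of positive margins) * x
    indicator = (margins > 0).astype(float)
    dW = np.outer(indicator, x)
    dW[y] = -np.sum(indicator) * x
    return loss, dW
```

Generalizing to multiple examples amounts to calling this per example and summing (or averaging) the losses and gradients; a quick numerical gradient check against finite differences is a standard way to verify the analytic derivatives.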