Lecture 6: Minimum encoding ball and Support vector
data description (SVDD)
Stéphane Canu
stephane.canu@litislab.eu
Sao Paulo 2014
May 12, 2014
Plan
1 Support Vector Data Description (SVDD)
SVDD, the smallest enclosing ball problem
The minimum enclosing ball problem with errors
The minimum enclosing ball problem in a RKHS
The two class Support vector data description (SVDD)
The minimum enclosing ball problem [Tax and Duin, 2004]

Given n points {x_i, i = 1, …, n}, find the smallest ball (center c, radius R) containing them all:
$$\min_{R \in \mathbb{R},\, c \in \mathbb{R}^d} \; R^2 \quad \text{with } \|x_i - c\|^2 \le R^2, \quad i = 1, \dots, n$$
What is that in the convex programming hierarchy?
LP, QP, QCQP, SOCP and SDP
Stéphane Canu (INSA Rouen - LITIS) May 12, 2014 3 / 35
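Before turning to the QP machinery, note the problem can also be solved approximately with a very simple iteration: the Bădoiu–Clarkson scheme repeatedly moves the center toward the current farthest point. A minimal numpy sketch (function name and iteration count are our own choices, not from the lecture):

```python
import numpy as np

def minimum_enclosing_ball(X, n_iter=2000):
    """Approximate minimum enclosing ball via the Badoiu-Clarkson
    iteration: at step k, move the center toward the farthest point
    with step size 1/(k+1). Accuracy improves as O(1/n_iter)."""
    c = X.mean(axis=0)                       # any starting center works
    for k in range(1, n_iter + 1):
        far = X[np.argmax(((X - c) ** 2).sum(axis=1))]
        c = c + (far - c) / (k + 1)
    R = np.sqrt(((X - c) ** 2).sum(axis=1).max())
    return c, R
```

On the four points (±1, 0), (0, ±1) this should recover a center near the origin and a radius near 1.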
The convex programming hierarchy (part of)

LP:
$$\min_x \; f^\top x \quad \text{with } Ax \le d \text{ and } 0 \le x$$

QP:
$$\min_x \; \tfrac{1}{2} x^\top G x + f^\top x \quad \text{with } Ax \le d$$

QCQP:
$$\min_x \; \tfrac{1}{2} x^\top G x + f^\top x \quad \text{with } x^\top B_i x + a_i^\top x \le d_i, \quad i = 1, \dots, n$$

SOCP:
$$\min_x \; f^\top x \quad \text{with } \|x - a_i\| \le b_i^\top x + d_i, \quad i = 1, \dots, n$$

The convex programming hierarchy?
Model generality: LP < QP < QCQP < SOCP < SDP
MEB as a QP in the primal

Theorem (MEB as a QP)
The two following problems are equivalent:
$$\min_{R \in \mathbb{R},\, c \in \mathbb{R}^d} \; R^2 \quad \text{with } \|x_i - c\|^2 \le R^2, \quad i = 1, \dots, n$$
$$\min_{w, \rho} \; \tfrac{1}{2}\|w\|^2 - \rho \quad \text{with } w^\top x_i \ge \rho + \tfrac{1}{2}\|x_i\|^2, \quad i = 1, \dots, n$$
with $\rho = \tfrac{1}{2}(\|c\|^2 - R^2)$ and $w = c$.

Proof:
$$\|x_i - c\|^2 \le R^2 \iff \|x_i\|^2 - 2 x_i^\top c + \|c\|^2 \le R^2 \iff 2 x_i^\top c \ge \|x_i\|^2 + \|c\|^2 - R^2$$
that is
$$x_i^\top c \ge \underbrace{\tfrac{1}{2}\big(\|c\|^2 - R^2\big)}_{\rho} + \tfrac{1}{2}\|x_i\|^2$$
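The theorem can be checked numerically: solve the (w, ρ) QP with a generic solver and map back to the ball via c = w and R² = ‖c‖² − 2ρ. A hedged sketch using scipy's SLSQP (the solver and function name are our own choices; any QP solver would do):

```python
import numpy as np
from scipy.optimize import minimize

def meb_via_qp(X):
    """Solve  min_{w,rho} 1/2 ||w||^2 - rho
       s.t.   w'x_i >= rho + 1/2 ||x_i||^2  for all i,
       then recover the ball: c = w, R^2 = ||c||^2 - 2 rho."""
    n, d = X.shape
    obj = lambda z: 0.5 * z[:d] @ z[:d] - z[d]
    cons = [{'type': 'ineq',
             'fun': lambda z, i=i: X[i] @ z[:d] - z[d] - 0.5 * X[i] @ X[i]}
            for i in range(n)]
    # feasible start: center at the mean, rho as small as needed
    c0 = X.mean(axis=0)
    rho0 = (X @ c0 - 0.5 * (X * X).sum(axis=1)).min()
    z = minimize(obj, np.append(c0, rho0), constraints=cons).x
    c, rho = z[:d], z[d]
    return c, np.sqrt(c @ c - 2 * rho)
```

On the four unit-circle points this should agree with the direct geometric answer (c near 0, R near 1).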
MEB and the one class SVM

SVDD:
$$\min_{w, \rho} \; \tfrac{1}{2}\|w\|^2 - \rho \quad \text{with } w^\top x_i \ge \rho + \tfrac{1}{2}\|x_i\|^2$$

SVDD and linear OCSVM (supporting hyperplane): if ‖x_i‖² is constant for all i = 1, …, n, this is the linear one class SVM (OC SVM).

The linear one class SVM [Schölkopf and Smola, 2002]:
$$\min_{w, \rho'} \; \tfrac{1}{2}\|w\|^2 - \rho' \quad \text{with } w^\top x_i \ge \rho'$$
with $\rho' = \rho + \tfrac{1}{2}\|x_i\|^2$. ⇒ OC SVM is a particular case of SVDD.
When ‖x_i‖² = 1 for all i = 1, …, n:
$$\|x_i - c\|^2 \le R^2 \iff w^\top x_i \ge \rho \quad \text{with } \rho = \tfrac{1}{2}\big(\|c\|^2 - R^2 + 1\big)$$

SVDD and OCSVM: "belonging to the ball" is also "being above" a hyperplane.
MEB: KKT

$$L(c, R, \alpha) = R^2 + \sum_{i=1}^n \alpha_i \big(\|x_i - c\|^2 - R^2\big)$$

KKT conditions:
- stationarity: $2c \sum_{i=1}^n \alpha_i - 2 \sum_{i=1}^n \alpha_i x_i = 0$ ← the representer theorem, and $1 - \sum_{i=1}^n \alpha_i = 0$
- primal admissibility: $\|x_i - c\|^2 \le R^2$
- dual admissibility: $\alpha_i \ge 0$, i = 1, …, n
- complementarity: $\alpha_i \big(\|x_i - c\|^2 - R^2\big) = 0$, i = 1, …, n

Complementarity tells us there are two groups of points: the support vectors ($\|x_i - c\|^2 = R^2$) and the insiders ($\alpha_i = 0$).
MEB: Dual

The representer theorem:
$$c = \frac{\sum_{i=1}^n \alpha_i x_i}{\sum_{i=1}^n \alpha_i} = \sum_{i=1}^n \alpha_i x_i$$

$$L(\alpha) = \sum_{i=1}^n \alpha_i \Big\| x_i - \sum_{j=1}^n \alpha_j x_j \Big\|^2$$

with $\sum_{i=1}^n \sum_{j=1}^n \alpha_i \alpha_j x_i^\top x_j = \alpha^\top G \alpha$ and $\sum_{i=1}^n \alpha_i x_i^\top x_i = \alpha^\top \mathrm{diag}(G)$, where $G = XX^\top$ is the Gram matrix, $G_{ij} = x_i^\top x_j$:

$$\min_{\alpha \in \mathbb{R}^n} \; \alpha^\top G \alpha - \alpha^\top \mathrm{diag}(G) \quad \text{with } e^\top \alpha = 1 \text{ and } 0 \le \alpha_i, \; i = 1, \dots, n$$
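The dual above is easy to try out. A hedged sketch (again with scipy's SLSQP, our own choice of solver): solve the dual, recover the center c = X⊤α, and read the radius off a support vector (α_i > 0):

```python
import numpy as np
from scipy.optimize import minimize

def meb_dual(X):
    """Solve the MEB dual  min_a a'Ga - a'diag(G),  e'a = 1, a >= 0,
    with G = XX' the Gram matrix, then recover the center c = X'a
    and the radius from a support vector (a_i > 0)."""
    n = X.shape[0]
    G = X @ X.T
    g = np.diag(G)
    a = minimize(lambda a: a @ G @ a - a @ g,
                 np.full(n, 1.0 / n),
                 bounds=[(0, None)] * n,
                 constraints=({'type': 'eq', 'fun': lambda a: a.sum() - 1},)).x
    c = X.T @ a
    sv = a > 1e-6                            # support vectors
    R = np.sqrt(((X[sv] - c) ** 2).sum(axis=1).max())
    return a, c, R
```

On the four unit-circle points the optimal α is uniform and the recovered ball is the unit ball.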
SVDD primal vs. dual

Primal:
$$\min_{R \in \mathbb{R},\, c \in \mathbb{R}^d} \; R^2 \quad \text{with } \|x_i - c\|^2 \le R^2, \; i = 1, \dots, n$$
- d + 1 unknowns
- n constraints
- can be recast as a QP
- perfect when d << n

Dual:
$$\min_{\alpha} \; \alpha^\top G \alpha - \alpha^\top \mathrm{diag}(G) \quad \text{with } e^\top \alpha = 1 \text{ and } 0 \le \alpha_i, \; i = 1, \dots, n$$
- n unknowns, with G the pairwise influence Gram matrix
- n box constraints
- easy to solve
- to be used when d > n

But where is R²?
Looking for R²

$$\min_{\alpha} \; \alpha^\top G \alpha - \alpha^\top \mathrm{diag}(G) \quad \text{with } e^\top \alpha = 1, \; 0 \le \alpha_i, \; i = 1, \dots, n$$

The Lagrangian: $L(\alpha, \mu, \beta) = \alpha^\top G \alpha - \alpha^\top \mathrm{diag}(G) + \mu (e^\top \alpha - 1) - \beta^\top \alpha$

Stationarity condition: $\nabla_\alpha L(\alpha, \mu, \beta) = 2 G \alpha - \mathrm{diag}(G) + \mu e - \beta = 0$

The bidual:
$$\min_{\alpha} \; \alpha^\top G \alpha + \mu \quad \text{with } -2 G \alpha + \mathrm{diag}(G) \le \mu e$$

By identification, $R^2 = \mu + \alpha^\top G \alpha = \mu + \|c\|^2$, where µ is the Lagrange multiplier associated with the equality constraint $\sum_{i=1}^n \alpha_i = 1$.

Also, because of the complementarity condition, if $x_i$ is a support vector then $\alpha_i > 0$, hence $\beta_i = 0$ and $R^2 = \|x_i - c\|^2$.
Plan
1 Support Vector Data Description (SVDD)
SVDD, the smallest enclosing ball problem
The minimum enclosing ball problem with errors
The minimum enclosing ball problem in a RKHS
The two class Support vector data description (SVDD)
The minimum enclosing ball problem with errors

The same road map:
- initial formulation
- reformulation (as a QP)
- Lagrangian, KKT
- dual formulation
- bidual

Initial formulation, for a given C:
$$\min_{R,\, c,\, \xi} \; R^2 + C \sum_{i=1}^n \xi_i \quad \text{with } \|x_i - c\|^2 \le R^2 + \xi_i \text{ and } \xi_i \ge 0, \; i = 1, \dots, n$$
The MEB with slack: QP, KKT, dual and R²

SVDD as a QP:
$$\min_{w, \rho, \xi} \; \tfrac{1}{2}\|w\|^2 - \rho + \tfrac{C}{2} \sum_{i=1}^n \xi_i \quad \text{with } w^\top x_i \ge \rho + \tfrac{1}{2}\|x_i\|^2 - \tfrac{1}{2}\xi_i \text{ and } \xi_i \ge 0, \; i = 1, \dots, n$$
again with OC SVM as a particular case.

With $G = XX^\top$, the dual SVDD is:
$$\min_{\alpha} \; \alpha^\top G \alpha - \alpha^\top \mathrm{diag}(G) \quad \text{with } e^\top \alpha = 1 \text{ and } 0 \le \alpha_i \le C, \; i = 1, \dots, n$$
for a given C ≤ 1. Since $e^\top \alpha = 1$ already forces $\alpha_i \le 1$, taking C larger than one is useless (it gives back the no-slack case).

$$R^2 = \mu + c^\top c$$
with µ denoting the Lagrange multiplier associated with the equality constraint $\sum_{i=1}^n \alpha_i = 1$.
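The box constraint is what lets outliers be left outside the ball: their α saturates at C. A hedged numeric sketch (scipy's SLSQP again, our own choice; R is read off a free support vector, i.e. one with 0 < α_i < C):

```python
import numpy as np
from scipy.optimize import minimize

def svdd_dual_soft(X, C):
    """Soft-margin SVDD dual:  min_a a'Ga - a'diag(G),
    e'a = 1, 0 <= a_i <= C.  Points with a_i = C sit outside
    the ball; the radius comes from a free SV (0 < a_i < C)."""
    n = X.shape[0]
    G = X @ X.T
    a = minimize(lambda a: a @ G @ a - a @ np.diag(G),
                 np.full(n, 1.0 / n),
                 bounds=[(0, C)] * n,
                 constraints=({'type': 'eq', 'fun': lambda a: a.sum() - 1},)).x
    c = X.T @ a
    free = (a > 1e-4) & (a < C - 1e-4)       # free support vectors
    R = np.sqrt(((X[free] - c) ** 2).sum(axis=1).max())
    return a, c, R
```

For instance, with the four unit-circle points plus an outlier at (3, 0) and C = 0.3, the outlier should saturate at α = C and the recovered center should sit near (0.6, 0) rather than being dragged out to enclose the outlier.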
Variations over SVDD

Adaptive SVDD: the weighted error case, for given weights $w_i$, i = 1, …, n (here the variable R plays the role of R²):
$$\min_{c \in \mathbb{R}^p,\, R \in \mathbb{R},\, \xi \in \mathbb{R}^n} \; R + C \sum_{i=1}^n w_i \xi_i \quad \text{with } \|x_i - c\|^2 \le R + \xi_i, \; \xi_i \ge 0, \; i = 1, \dots, n$$

The dual of this problem is a QP [see for instance Liu et al., 2013]:
$$\min_{\alpha \in \mathbb{R}^n} \; \alpha^\top XX^\top \alpha - \alpha^\top \mathrm{diag}(XX^\top) \quad \text{with } \sum_{i=1}^n \alpha_i = 1, \; 0 \le \alpha_i \le C w_i, \; i = 1, \dots, n$$

Density induced SVDD (D-SVDD):
$$\min_{c \in \mathbb{R}^p,\, R \in \mathbb{R},\, \xi \in \mathbb{R}^n} \; R + C \sum_{i=1}^n \xi_i \quad \text{with } w_i \|x_i - c\|^2 \le R + \xi_i, \; \xi_i \ge 0, \; i = 1, \dots, n$$
Plan
1 Support Vector Data Description (SVDD)
SVDD, the smallest enclosing ball problem
The minimum enclosing ball problem with errors
The minimum enclosing ball problem in a RKHS
The two class Support vector data description (SVDD)
SVDD in a RKHS

The feature map: $\mathbb{R}^p \longrightarrow \mathcal{H}$, with $c \longmapsto f(\bullet)$ and $x_i \longmapsto k(x_i, \bullet)$, so that
$$\|x_i - c\|^2_{\mathbb{R}^p} \le R^2 \quad \longrightarrow \quad \|k(x_i, \bullet) - f(\bullet)\|^2_{\mathcal{H}} \le R^2$$

Kernelized SVDD (in a RKHS) is also a QP:
$$\min_{f \in \mathcal{H},\, R \in \mathbb{R},\, \xi \in \mathbb{R}^n} \; R^2 + C \sum_{i=1}^n \xi_i \quad \text{with } \|k(x_i, \bullet) - f(\bullet)\|^2_{\mathcal{H}} \le R^2 + \xi_i, \; \xi_i \ge 0, \; i = 1, \dots, n$$
SVDD in a RKHS: KKT, dual and R²

$$L = R^2 + C \sum_{i=1}^n \xi_i + \sum_{i=1}^n \alpha_i \big(\|k(x_i, \cdot) - f(\cdot)\|^2_{\mathcal{H}} - R^2 - \xi_i\big) - \sum_{i=1}^n \beta_i \xi_i$$
$$\;\; = R^2 + C \sum_{i=1}^n \xi_i + \sum_{i=1}^n \alpha_i \big(k(x_i, x_i) - 2 f(x_i) + \|f\|^2_{\mathcal{H}} - R^2 - \xi_i\big) - \sum_{i=1}^n \beta_i \xi_i$$

KKT conditions:
- Stationarity: $2 f(\cdot) \sum_{i=1}^n \alpha_i - 2 \sum_{i=1}^n \alpha_i k(\cdot, x_i) = 0$ ← the representer theorem; $1 - \sum_{i=1}^n \alpha_i = 0$; $C - \alpha_i - \beta_i = 0$
- Primal admissibility: $\|k(x_i, \cdot) - f(\cdot)\|^2 \le R^2 + \xi_i$, $\xi_i \ge 0$
- Dual admissibility: $\alpha_i \ge 0$, $\beta_i \ge 0$
- Complementarity: $\alpha_i \big(\|k(x_i, \cdot) - f(\cdot)\|^2 - R^2 - \xi_i\big) = 0$ and $\beta_i \xi_i = 0$
SVDD in a RKHS: dual and R²

$$L(\alpha) = \sum_{i=1}^n \alpha_i k(x_i, x_i) - 2 \sum_{i=1}^n \alpha_i f(x_i) + \|f\|^2_{\mathcal{H}} \quad \text{with } f(\cdot) = \sum_{j=1}^n \alpha_j k(\cdot, x_j)$$
$$= \sum_{i=1}^n \alpha_i k(x_i, x_i) - \sum_{i=1}^n \sum_{j=1}^n \alpha_i \alpha_j \underbrace{k(x_i, x_j)}_{G_{ij}}$$

With $G_{ij} = k(x_i, x_j)$:
$$\min_{\alpha} \; \alpha^\top G \alpha - \alpha^\top \mathrm{diag}(G) \quad \text{with } e^\top \alpha = 1 \text{ and } 0 \le \alpha_i \le C, \; i = 1, \dots, n$$

As in the linear case, $R^2 = \mu + \|f\|^2_{\mathcal{H}}$, with µ denoting the Lagrange multiplier associated with the equality constraint $\sum_{i=1}^n \alpha_i = 1$.
SVDD train and val in a RKHS

Train using the dual form (in: G, C; out: α, µ):
$$\min_{\alpha} \; \alpha^\top G \alpha - \alpha^\top \mathrm{diag}(G) \quad \text{with } e^\top \alpha = 1 \text{ and } 0 \le \alpha_i \le C, \; i = 1, \dots, n$$

Val with the center in the RKHS, $f(\cdot) = \sum_{i=1}^n \alpha_i k(\cdot, x_i)$:
$$\begin{aligned}
\varphi(x) &= \|k(x, \cdot) - f(\cdot)\|^2_{\mathcal{H}} - R^2 \\
&= \|k(x, \cdot)\|^2_{\mathcal{H}} - 2 \langle k(x, \cdot), f(\cdot) \rangle_{\mathcal{H}} + \|f(\cdot)\|^2_{\mathcal{H}} - R^2 \\
&= k(x, x) - 2 f(x) + R^2 - \mu - R^2 \\
&= -2 f(x) + k(x, x) - \mu \\
&= -2 \sum_{i=1}^n \alpha_i k(x, x_i) + k(x, x) - \mu
\end{aligned}$$
φ(x) = 0 is the decision border.
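The train/val recipe fits in a few lines. A hedged sketch with a Gaussian kernel (kernel, bandwidth and solver are our own assumptions): training solves the dual, µ comes from the stationarity condition 2(Gα)_i − G_ii + µ = 0 at a free support vector, and φ is evaluated as derived above:

```python
import numpy as np
from scipy.optimize import minimize

def rbf(A, B, s=1.0):
    """Gaussian kernel matrix k(a, b) = exp(-||a - b||^2 / (2 s^2))."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * s * s))

def svdd_train(X, C=1.0, s=1.0):
    """Dual training: returns alpha and mu, the equality-constraint
    multiplier, recovered at a free SV (0 < alpha_i < C)."""
    n = X.shape[0]
    G = rbf(X, X, s)
    a = minimize(lambda a: a @ G @ a - a @ np.diag(G),
                 np.full(n, 1.0 / n),
                 bounds=[(0, C)] * n,
                 constraints=({'type': 'eq', 'fun': lambda a: a.sum() - 1},)).x
    i = int(np.argmax((a > 1e-6) & (a < C - 1e-4)))
    mu = G[i, i] - 2 * (G @ a)[i]            # stationarity at a free SV
    return a, mu

def svdd_phi(x, X, a, mu, s=1.0):
    """phi(x) = k(x,x) - 2 f(x) - mu; phi <= 0 inside the ball."""
    kx = rbf(np.atleast_2d(x), X, s)[0]
    return 1.0 - 2.0 * (kx @ a) - mu         # k(x,x) = 1 for the RBF kernel
```

Training points should land inside or on the ball (φ ≤ 0 up to solver tolerance) while a faraway point gets φ > 0.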
An important theoretical result

For a well-calibrated bandwidth, the SVDD estimates the underlying distribution's level set [Vert and Vert, 2006]. The level sets of a probability density function IP(x) are the sets
$$C_p = \{x \in \mathbb{R}^d \mid \mathbb{P}(x) \ge p\}$$
They are well estimated by the empirical minimum volume set
$$V_p = \{x \in \mathbb{R}^d \mid \|k(x, \cdot) - f(\cdot)\|^2_{\mathcal{H}} - R^2 \le 0\}$$
The frontiers coincide.
SVDD: the generalization error

For a well-calibrated bandwidth, with $(x_1, \dots, x_n)$ i.i.d. from some fixed but unknown IP(x), then [Shawe-Taylor and Cristianini, 2004], with probability at least 1 − δ (for all δ ∈ ]0, 1[) and for any margin m > 0:
$$\mathbb{P}\Big(\|k(x, \cdot) - f(\cdot)\|^2_{\mathcal{H}} \ge R^2 + m\Big) \le \frac{1}{mn} \sum_{i=1}^n \xi_i + \frac{6 R^2}{m \sqrt{n}} + 3 \sqrt{\frac{\ln(2/\delta)}{2n}}$$
Equivalence between SVDD and OCSVM for translation invariant kernels (diagonal constant kernels)

Theorem
Let H be a RKHS on some domain X endowed with kernel k. If there exists some constant c such that ∀x ∈ X, k(x, x) = c, then the two following problems are equivalent:
$$\min_{f, R, \xi} \; R + C \sum_{i=1}^n \xi_i \quad \text{with } \|k(x_i, \cdot) - f(\cdot)\|^2_{\mathcal{H}} \le R + \xi_i, \; \xi_i \ge 0, \; i = 1, \dots, n$$
$$\min_{f, \rho, \varepsilon} \; \tfrac{1}{2}\|f\|^2_{\mathcal{H}} - \rho + C \sum_{i=1}^n \varepsilon_i \quad \text{with } f(x_i) \ge \rho - \varepsilon_i, \; \varepsilon_i \ge 0, \; i = 1, \dots, n$$
with $\rho = \tfrac{1}{2}(c + \|f\|^2_{\mathcal{H}} - R)$ and $\varepsilon_i = \tfrac{1}{2}\xi_i$.
Proof of the equivalence between SVDD and OCSVM

$$\min_{f \in \mathcal{H},\, R \in \mathbb{R},\, \xi \in \mathbb{R}^n} \; R + C \sum_{i=1}^n \xi_i \quad \text{with } \|k(x_i, \cdot) - f(\cdot)\|^2_{\mathcal{H}} \le R + \xi_i, \; \xi_i \ge 0, \; i = 1, \dots, n$$
Since $\|k(x_i, \cdot) - f(\cdot)\|^2_{\mathcal{H}} = k(x_i, x_i) + \|f\|^2_{\mathcal{H}} - 2 f(x_i)$, this is
$$\min_{f \in \mathcal{H},\, R \in \mathbb{R},\, \xi \in \mathbb{R}^n} \; R + C \sum_{i=1}^n \xi_i \quad \text{with } 2 f(x_i) \ge k(x_i, x_i) + \|f\|^2_{\mathcal{H}} - R - \xi_i, \; \xi_i \ge 0, \; i = 1, \dots, n.$$
Introducing $\rho = \tfrac{1}{2}(c + \|f\|^2_{\mathcal{H}} - R)$, that is $R = c + \|f\|^2_{\mathcal{H}} - 2\rho$, and since $k(x_i, x_i)$ is constant and equal to c, the SVDD problem becomes
$$\min_{f \in \mathcal{H},\, \rho \in \mathbb{R},\, \xi \in \mathbb{R}^n} \; \tfrac{1}{2}\|f\|^2_{\mathcal{H}} - \rho + \tfrac{C}{2} \sum_{i=1}^n \xi_i \quad \text{with } f(x_i) \ge \rho - \tfrac{1}{2}\xi_i, \; \xi_i \ge 0, \; i = 1, \dots, n$$
leading to the classical one class SVM formulation (OCSVM)
$$\min_{f \in \mathcal{H},\, \rho \in \mathbb{R},\, \varepsilon \in \mathbb{R}^n} \; \tfrac{1}{2}\|f\|^2_{\mathcal{H}} - \rho + C \sum_{i=1}^n \varepsilon_i \quad \text{with } f(x_i) \ge \rho - \varepsilon_i, \; \varepsilon_i \ge 0, \; i = 1, \dots, n$$
with $\varepsilon_i = \tfrac{1}{2}\xi_i$. Note that by putting $\nu = \tfrac{1}{nC}$ we can get the so-called ν-formulation of the OCSVM
$$\min_{f' \in \mathcal{H},\, \rho' \in \mathbb{R},\, \xi' \in \mathbb{R}^n} \; \tfrac{1}{2}\|f'\|^2_{\mathcal{H}} - n \nu \rho' + \sum_{i=1}^n \xi'_i \quad \text{with } f'(x_i) \ge \rho' - \xi'_i, \; \xi'_i \ge 0, \; i = 1, \dots, n$$
with $f' = C f$, $\rho' = C \rho$, and $\xi' = C \xi$.
Duality

Note that the dual of the SVDD is
$$\min_{\alpha \in \mathbb{R}^n} \; \alpha^\top G \alpha - \alpha^\top g \quad \text{with } \sum_{i=1}^n \alpha_i = 1, \; 0 \le \alpha_i \le C, \; i = 1, \dots, n$$
where G is the kernel matrix of general term $G_{i,j} = k(x_i, x_j)$ and g the diagonal vector such that $g_i = k(x_i, x_i) = c$. The dual of the OCSVM is the following equivalent QP
$$\min_{\alpha \in \mathbb{R}^n} \; \tfrac{1}{2} \alpha^\top G \alpha \quad \text{with } \sum_{i=1}^n \alpha_i = 1, \; 0 \le \alpha_i \le C, \; i = 1, \dots, n$$
Both dual forms provide the same solution α, but not the same Lagrange multipliers. ρ is the Lagrange multiplier of the equality constraint of the dual of the OCSVM, and $R = c + \alpha^\top G \alpha - 2\rho$. Using the SVDD dual, it turns out that $R = \lambda_{eq} + \alpha^\top G \alpha$, where $\lambda_{eq}$ is the Lagrange multiplier of the equality constraint of the SVDD dual form.
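The equivalence of the two duals is easy to check numerically: with a constant-diagonal kernel, α⊤g is constant on the simplex, so the SVDD dual and the OCSVM dual differ only by an additive constant and a factor of two, and share the same minimizer. A quick hedged check (Gaussian kernel, random data, scipy's SLSQP — all our own choices):

```python
import numpy as np
from scipy.optimize import minimize

def solve_qp(obj, n, C):
    """min obj(a)  s.t.  sum(a) = 1, 0 <= a_i <= C."""
    return minimize(obj, np.full(n, 1.0 / n),
                    bounds=[(0, C)] * n,
                    constraints=({'type': 'eq',
                                  'fun': lambda a: a.sum() - 1},)).x

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 2))
d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
G = np.exp(-d2 / 2)                      # RBF kernel: G_ii = 1 for all i
g = np.diag(G)
C = 0.5
a_svdd = solve_qp(lambda a: a @ G @ a - a @ g, 8, C)      # SVDD dual
a_ocsvm = solve_qp(lambda a: 0.5 * a @ G @ a, 8, C)       # OCSVM dual
```

Since the RBF Gram matrix of distinct points is positive definite, each dual has a unique minimizer, and the two should coincide.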
Plan
1 Support Vector Data Description (SVDD)
SVDD, the smallest enclosing ball problem
The minimum enclosing ball problem with errors
The minimum enclosing ball problem in a RKHS
The two class Support vector data description (SVDD)
The two class Support vector data description (SVDD)

[Figure: two-class data, with the SVDD ball enclosing the positive class]

$$\min_{c, R, \xi^+, \xi^-} \; R^2 + C \Big( \sum_{y_i = 1} \xi_i^+ + \sum_{y_i = -1} \xi_i^- \Big)$$
with $\|x_i - c\|^2 \le R^2 + \xi_i^+$, $\xi_i^+ \ge 0$ for i such that $y_i = 1$, and $\|x_i - c\|^2 \ge R^2 - \xi_i^-$, $\xi_i^- \ge 0$ for i such that $y_i = -1$.
The two class SVDD as a QP

$$\min_{c, R, \xi^+, \xi^-} \; R^2 + C \Big( \sum_{y_i = 1} \xi_i^+ + \sum_{y_i = -1} \xi_i^- \Big)$$
with $\|x_i - c\|^2 \le R^2 + \xi_i^+$, $\xi_i^+ \ge 0$ for i such that $y_i = 1$, and $\|x_i - c\|^2 \ge R^2 - \xi_i^-$, $\xi_i^- \ge 0$ for i such that $y_i = -1$.

Expanding the squares, the constraints read:
- $\|x_i\|^2 - 2 x_i^\top c + \|c\|^2 \le R^2 + \xi_i^+$, $\xi_i^+ \ge 0$ for i such that $y_i = 1$
- $\|x_i\|^2 - 2 x_i^\top c + \|c\|^2 \ge R^2 - \xi_i^-$, $\xi_i^- \ge 0$ for i such that $y_i = -1$

that is:
- $2 x_i^\top c \ge \|c\|^2 - R^2 + \|x_i\|^2 - \xi_i^+$, $\xi_i^+ \ge 0$ for i such that $y_i = 1$
- $-2 x_i^\top c \ge -\|c\|^2 + R^2 - \|x_i\|^2 - \xi_i^-$, $\xi_i^- \ge 0$ for i such that $y_i = -1$

or, as a single family of constraints:
$$2 y_i x_i^\top c \ge y_i \big( \|c\|^2 - R^2 + \|x_i\|^2 \big) - \xi_i, \quad \xi_i \ge 0, \; i = 1, \dots, n$$

Change of variable $\rho = \|c\|^2 - R^2$:
$$\min_{c, \rho, \xi} \; \|c\|^2 - \rho + C \sum_{i=1}^n \xi_i \quad \text{with } 2 y_i x_i^\top c \ge y_i \big( \rho + \|x_i\|^2 \big) - \xi_i \text{ and } \xi_i \ge 0, \; i = 1, \dots, n$$
The dual of the two class SVDD

With $G_{ij} = y_i y_j x_i^\top x_j$, the dual formulation is:
$$\min_{\alpha \in \mathbb{R}^n} \; \alpha^\top G \alpha - \sum_{i=1}^n \alpha_i y_i \|x_i\|^2 \quad \text{with } \sum_{i=1}^n y_i \alpha_i = 1 \text{ and } 0 \le \alpha_i \le C, \; i = 1, \dots, n$$
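The two-class dual can be sketched the same way (scipy's SLSQP and the toy data are our own choices): solve for α, recover c = Σ α_i y_i x_i, read ρ off a free support vector via 2x_i⊤c − ‖x_i‖² = ρ, and then R² = ‖c‖² − ρ:

```python
import numpy as np
from scipy.optimize import minimize

def two_class_svdd(X, y, C=1.0):
    """Two-class SVDD dual:  min_a a'Ga - sum_i a_i y_i ||x_i||^2
    with y'a = 1, 0 <= a_i <= C and G_ij = y_i y_j x_i'x_j."""
    n = X.shape[0]
    Yx = y[:, None] * X
    G = Yx @ Yx.T                            # G_ij = y_i y_j x_i'x_j
    f = (X * X).sum(axis=1) * y
    a0 = np.clip(y, 0, 1)
    a0 = a0 / a0.sum()                       # feasible start: y'a0 = 1
    a = minimize(lambda a: a @ G @ a - a @ f, a0,
                 bounds=[(0, C)] * n,
                 constraints=({'type': 'eq', 'fun': lambda a: a @ y - 1},)).x
    c = X.T @ (a * y)
    i = int(np.argmax((a > 1e-4) & (a < C - 1e-4)))   # a free support vector
    rho = 2 * X[i] @ c - X[i] @ X[i]
    return c, np.sqrt(c @ c - rho)
```

On a symmetric toy set (positives on the unit circle, negatives on a circle of radius 3) the ball should shrink onto the positive class: center near 0, radius near 1.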
The two class SVDD vs. one class SVDD
The two class SVDD (left) vs. the one class SVDD (right)
Small Sphere and Large Margin (SSLM) approach

Support vector data description with margin [Wu and Ye, 2009]:
$$\min_{c, R, \xi} \; R^2 + C \Big( \sum_{y_i = 1} \xi_i^+ + \sum_{y_i = -1} \xi_i^- \Big)$$
with $\|x_i - c\|^2 \le R^2 - 1 + \xi_i^+$, $\xi_i^+ \ge 0$ for i such that $y_i = 1$, and $\|x_i - c\|^2 \ge R^2 + 1 - \xi_i^-$, $\xi_i^- \ge 0$ for i such that $y_i = -1$.

Since $\|x_i - c\|^2 \ge R^2 + 1 - \xi_i^-$ with $y_i = -1$ is $y_i \|x_i - c\|^2 \le y_i R^2 - 1 + \xi_i^-$, the Lagrangian is
$$L(c, R, \xi, \alpha, \beta) = R^2 + C \sum_{i=1}^n \xi_i + \sum_{i=1}^n \alpha_i \big( y_i \|x_i - c\|^2 - y_i R^2 + 1 - \xi_i \big) - \sum_{i=1}^n \beta_i \xi_i$$
SVDD with margin – dual formulation

$$L(c, R, \xi, \alpha, \beta) = R^2 + C \sum_{i=1}^n \xi_i + \sum_{i=1}^n \alpha_i \big( y_i \|x_i - c\|^2 - y_i R^2 + 1 - \xi_i \big) - \sum_{i=1}^n \beta_i \xi_i$$

Optimality: $c = \sum_{i=1}^n \alpha_i y_i x_i$; $\sum_{i=1}^n \alpha_i y_i = 1$; $0 \le \alpha_i \le C$.

$$L(\alpha) = \sum_{i=1}^n \alpha_i y_i \Big\| x_i - \sum_{j=1}^n \alpha_j y_j x_j \Big\|^2 + \sum_{i=1}^n \alpha_i = -\sum_{i=1}^n \sum_{j=1}^n \alpha_i \alpha_j y_i y_j x_j^\top x_i + \sum_{i=1}^n \|x_i\|^2 y_i \alpha_i + \sum_{i=1}^n \alpha_i$$

The dual SSLM is also a quadratic program:
$$\min_{\alpha \in \mathbb{R}^n} \; \alpha^\top G \alpha - e^\top \alpha - f^\top \alpha \quad \text{with } y^\top \alpha = 1 \text{ and } 0 \le \alpha_i \le C, \; i = 1, \dots, n$$
with G the symmetric n × n matrix such that $G_{ij} = y_i y_j x_j^\top x_i$ and $f_i = \|x_i\|^2 y_i$.
Conclusion

Applications:
- outlier detection
- change detection
- clustering
- large number of classes
- variable selection, …

A clear path:
- reformulation (to a standard problem)
- KKT
- dual
- bidual

A lot of variations:
- L2 SVDD
- two classes, non symmetric
- two classes in the symmetric case (SVM)
- the multi-class issue
- practical problems with translation invariant kernels
Bibliography

- Bo Liu, Yanshan Xiao, Longbing Cao, Zhifeng Hao, and Feiqi Deng. SVDD-based outlier detection on uncertain data. Knowledge and Information Systems, 34(3):597–618, 2013.
- B. Schölkopf and A. J. Smola. Learning with Kernels. MIT Press, 2002.
- John Shawe-Taylor and Nello Cristianini. Kernel Methods for Pattern Analysis. Cambridge University Press, 2004.
- David M. J. Tax and Robert P. W. Duin. Support vector data description. Machine Learning, 54(1):45–66, 2004.
- Régis Vert and Jean-Philippe Vert. Consistency and convergence rates of one-class SVMs and related algorithms. Journal of Machine Learning Research, 7:817–854, 2006.
- Mingrui Wu and Jieping Ye. A small sphere and large margin approach for novelty detection using training data with outliers. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(11):2088–2092, 2009.
More Related Content

PDF
[DL輪読会]Generative Models of Visually Grounded Imagination
PDF
拡がるディープラーニングの活用
PDF
サポートベクターマシン(SVM)の数学をみんなに説明したいだけの会
PDF
Map reduce: beyond word count
PDF
はじパタ8章 svm
PDF
NeurIPS2021から見るメタ学習の研究動向 - 第83回人工知能セミナー (2022.3.7)「AIトレンド・トップカンファレンス報告会(NeurI...
PDF
Chapitre 4-Apprentissage non supervisé (1) (1).pdf
PDF
文献紹介:TSM: Temporal Shift Module for Efficient Video Understanding
[DL輪読会]Generative Models of Visually Grounded Imagination
拡がるディープラーニングの活用
サポートベクターマシン(SVM)の数学をみんなに説明したいだけの会
Map reduce: beyond word count
はじパタ8章 svm
NeurIPS2021から見るメタ学習の研究動向 - 第83回人工知能セミナー (2022.3.7)「AIトレンド・トップカンファレンス報告会(NeurI...
Chapitre 4-Apprentissage non supervisé (1) (1).pdf
文献紹介:TSM: Temporal Shift Module for Efficient Video Understanding

What's hot (20)

PDF
[DL輪読会] Learning from Simulated and Unsupervised Images through Adversarial T...
PDF
(DL hacks輪読) Deep Kalman Filters
PDF
辞書ベースのテキストマイニング ML-Ask, J-LIWC, J-MDFを例として
PPTX
[DL輪読会]相互情報量最大化による表現学習
PPTX
DLLab 異常検知ナイト 資料 20180214
PPTX
[卒論]眼球運動測定と眼球運動計測による問題解決プロセスの分析
PPTX
友人関係と感染症伝搬をネットワークで理解する
PDF
連続変量を含む相互情報量の推定
PDF
統計学勉強会#2
PDF
Word Tour: One-dimensional Word Embeddings via the Traveling Salesman Problem...
PDF
Introduction to statistics
PPTX
[DL輪読会]Xception: Deep Learning with Depthwise Separable Convolutions
PDF
最新の異常検知手法(NIPS 2018)
PDF
はじぱた7章F5up
PDF
Skip Connection まとめ(Neural Network)
PDF
2023/06/01 IoT ALGYAN ChatGPT研究会第9弾 資料
PDF
線形?非線形?
PPTX
[DL輪読会]医用画像解析におけるセグメンテーション
PDF
[第2版] Python機械学習プログラミング 第4章
PDF
XGBoostからNGBoostまで
[DL輪読会] Learning from Simulated and Unsupervised Images through Adversarial T...
(DL hacks輪読) Deep Kalman Filters
辞書ベースのテキストマイニング ML-Ask, J-LIWC, J-MDFを例として
[DL輪読会]相互情報量最大化による表現学習
DLLab 異常検知ナイト 資料 20180214
[卒論]眼球運動測定と眼球運動計測による問題解決プロセスの分析
友人関係と感染症伝搬をネットワークで理解する
連続変量を含む相互情報量の推定
統計学勉強会#2
Word Tour: One-dimensional Word Embeddings via the Traveling Salesman Problem...
Introduction to statistics
[DL輪読会]Xception: Deep Learning with Depthwise Separable Convolutions
最新の異常検知手法(NIPS 2018)
はじぱた7章F5up
Skip Connection まとめ(Neural Network)
2023/06/01 IoT ALGYAN ChatGPT研究会第9弾 資料
線形?非線形?
[DL輪読会]医用画像解析におけるセグメンテーション
[第2版] Python機械学習プログラミング 第4章
XGBoostからNGBoostまで
Ad

Viewers also liked (6)

PDF
Lecture10 outilier l0_svdd
PDF
Support Vector Machines for Classification
PDF
PPTX
Anomaly Detection using SIngle Class SVM with Gaussian Kernel
PPTX
ODSC - Neural Networks on AWS Lambda
PPTX
Human values & professional ethics
Lecture10 outilier l0_svdd
Support Vector Machines for Classification
Anomaly Detection using SIngle Class SVM with Gaussian Kernel
ODSC - Neural Networks on AWS Lambda
Human values & professional ethics
Ad

Similar to Lecture6 svdd (20)

PDF
Solutions for Problems from Applied Optimization by Ross Baldick
PDF
Trilinear embedding for divergence-form operators
PDF
Bayesian inference on mixtures
PDF
Empowering Fourier-based Pricing Methods for Efficient Valuation of High-Dime...
PDF
Positive and negative solutions of a boundary value problem for a fractional ...
PDF
A Szemeredi-type theorem for subsets of the unit cube
PDF
Solutions Manual for Calculus Early Transcendentals 10th Edition by Anton
PDF
Calculus Early Transcendentals 10th Edition Anton Solutions Manual
PDF
Practical and Worst-Case Efficient Apportionment
PDF
Hierarchical matrices for approximating large covariance matries and computin...
PDF
QMC Error SAMSI Tutorial Aug 2017
PDF
A Proof of the Generalized Riemann Hypothesis
PDF
A Proof of the Generalized Riemann Hypothesis
PDF
Classification with mixtures of curved Mahalanobis metrics
PDF
Solving integral equations on boundaries with corners, edges, and nearly sing...
PDF
Nonconvex Compressed Sensing with the Sum-of-Squares Method
PDF
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
PDF
IIT Jam math 2016 solutions BY Trajectoryeducation
PDF
Doubly Accelerated Stochastic Variance Reduced Gradient Methods for Regulariz...
PDF
Shell theory
Solutions for Problems from Applied Optimization by Ross Baldick
Trilinear embedding for divergence-form operators
Bayesian inference on mixtures
Empowering Fourier-based Pricing Methods for Efficient Valuation of High-Dime...
Positive and negative solutions of a boundary value problem for a fractional ...
A Szemeredi-type theorem for subsets of the unit cube
Solutions Manual for Calculus Early Transcendentals 10th Edition by Anton
Calculus Early Transcendentals 10th Edition Anton Solutions Manual
Practical and Worst-Case Efficient Apportionment
Hierarchical matrices for approximating large covariance matries and computin...
QMC Error SAMSI Tutorial Aug 2017
A Proof of the Generalized Riemann Hypothesis
A Proof of the Generalized Riemann Hypothesis
Classification with mixtures of curved Mahalanobis metrics
Solving integral equations on boundaries with corners, edges, and nearly sing...
Nonconvex Compressed Sensing with the Sum-of-Squares Method
Program on Quasi-Monte Carlo and High-Dimensional Sampling Methods for Applie...
IIT Jam math 2016 solutions BY Trajectoryeducation
Doubly Accelerated Stochastic Variance Reduced Gradient Methods for Regulariz...
Shell theory

More from Stéphane Canu (10)

PDF
Lecture 2: linear SVM in the Dual
PDF
Lecture8 multi class_svm
PDF
Lecture7 cross validation
PDF
Lecture5 kernel svm
PDF
Lecture4 kenrels functions_rkhs
PDF
Lecture3 linear svm_with_slack
PDF
Lecture 2: linear SVM in the dual
PDF
Lecture 1: linear SVM in the primal
PDF
Lecture9 multi kernel_svm
PDF
Main recsys factorisation
Lecture 2: linear SVM in the Dual
Lecture8 multi class_svm
Lecture7 cross validation
Lecture5 kernel svm
Lecture4 kenrels functions_rkhs
Lecture3 linear svm_with_slack
Lecture 2: linear SVM in the dual
Lecture 1: linear SVM in the primal
Lecture9 multi kernel_svm
Main recsys factorisation

Recently uploaded (20)

PDF
Encapsulation theory and applications.pdf
PDF
Per capita expenditure prediction using model stacking based on satellite ima...
PPTX
Spectroscopy.pptx food analysis technology
PPTX
20250228 LYD VKU AI Blended-Learning.pptx
PPTX
Effective Security Operations Center (SOC) A Modern, Strategic, and Threat-In...
PDF
Building Integrated photovoltaic BIPV_UPV.pdf
PDF
Architecting across the Boundaries of two Complex Domains - Healthcare & Tech...
PDF
cuic standard and advanced reporting.pdf
PDF
Peak of Data & AI Encore- AI for Metadata and Smarter Workflows
PDF
Spectral efficient network and resource selection model in 5G networks
PDF
Approach and Philosophy of On baking technology
PPTX
Programs and apps: productivity, graphics, security and other tools
PDF
Profit Center Accounting in SAP S/4HANA, S4F28 Col11
DOCX
The AUB Centre for AI in Media Proposal.docx
PDF
Optimiser vos workloads AI/ML sur Amazon EC2 et AWS Graviton
PDF
Diabetes mellitus diagnosis method based random forest with bat algorithm
PDF
TokAI - TikTok AI Agent : The First AI Application That Analyzes 10,000+ Vira...
PDF
Reach Out and Touch Someone: Haptics and Empathic Computing
PDF
MIND Revenue Release Quarter 2 2025 Press Release
PPTX
Cloud computing and distributed systems.
Encapsulation theory and applications.pdf
Per capita expenditure prediction using model stacking based on satellite ima...
Spectroscopy.pptx food analysis technology
20250228 LYD VKU AI Blended-Learning.pptx
Effective Security Operations Center (SOC) A Modern, Strategic, and Threat-In...
Building Integrated photovoltaic BIPV_UPV.pdf
Architecting across the Boundaries of two Complex Domains - Healthcare & Tech...
cuic standard and advanced reporting.pdf
Peak of Data & AI Encore- AI for Metadata and Smarter Workflows
Spectral efficient network and resource selection model in 5G networks
Approach and Philosophy of On baking technology
Programs and apps: productivity, graphics, security and other tools
Profit Center Accounting in SAP S/4HANA, S4F28 Col11
The AUB Centre for AI in Media Proposal.docx
Optimiser vos workloads AI/ML sur Amazon EC2 et AWS Graviton
Diabetes mellitus diagnosis method based random forest with bat algorithm
TokAI - TikTok AI Agent : The First AI Application That Analyzes 10,000+ Vira...
Reach Out and Touch Someone: Haptics and Empathic Computing
MIND Revenue Release Quarter 2 2025 Press Release
Cloud computing and distributed systems.

Lecture6 svdd

  • 1. Lecture 6: Minimum encoding ball and Support vector data description (SVDD) Stéphane Canu stephane.canu@litislab.eu Sao Paulo 2014 May 12, 2014
  • 2. Plan 1 Support Vector Data Description (SVDD) SVDD, the smallest enclosing ball problem The minimum enclosing ball problem with errors The minimum enclosing ball problem in a RKHS The two class Support vector data description (SVDD)
  • 3. The minimum enclosing ball problem [Tax and Duin, 2004] Stéphane Canu (INSA Rouen - LITIS) May 12, 2014 3 / 35
  • 4. The minimum enclosing ball problem [Tax and Duin, 2004] the center Stéphane Canu (INSA Rouen - LITIS) May 12, 2014 3 / 35
  • 5. The minimum enclosing ball problem [Tax and Duin, 2004] the radius Given n points, {xi , i = 1, n} . min R∈IR,c∈IRd R2 with xi − c 2 ≤ R2 , i = 1, . . . , n What is that in the convex programming hierarchy? LP, QP, QCQP, SOCP and SDP Stéphane Canu (INSA Rouen - LITIS) May 12, 2014 3 / 35
  • 6. The convex programming hierarchy (part of) LP    min x f x with Ax ≤ d and 0 ≤ x QP min x 1 2 x Gx + f x with Ax ≤ d QCQP    min x 1 2 x Gx + f x with x Bi x + ai x ≤ di i = 1, n SOCP    min x f x with x − ai ≤ bi x + di i = 1, n The convex programming hierarchy? Model generality: LP < QP < QCQP < SOCP < SDP Stéphane Canu (INSA Rouen - LITIS) May 12, 2014 4 / 35
  • 7. MEB as a QP in the primal Theorem (MEB as a QP) The two following problems are equivalent, min R∈IR,c∈IRd R2 with xi − c 2 ≤ R2 , i = 1, . . . , n min w,ρ 1 2 w 2 − ρ with w xi ≥ ρ + 1 2 xi 2 with ρ = 1 2 ( c 2 − R2 ) and w = c. Proof: xi − c 2 ≤ R2 xi 2 − 2xi c + c 2 ≤ R2 −2xi c ≤ R2 − xi 2 − c 2 2xi c ≥ −R2 + xi 2 + c 2 xi c ≥ 1 2 ( c 2 − R2 ) ρ +1 2 xi 2 Stéphane Canu (INSA Rouen - LITIS) May 12, 2014 5 / 35
  • 8. MEB and the one class SVM SVDD: min w,ρ 1 2 w 2 − ρ with w xi ≥ ρ + 1 2 xi 2 SVDD and linear OCSVM (Supporting Hyperplane) if ∀i = 1, n, xi 2 = constant, it is the the linear one class SVM (OC SVM) The linear one class SVM [Schölkopf and Smola, 2002] min w,ρ 1 2 w 2 − ρ with w xi ≥ ρ with ρ = ρ + 1 2 xi 2 ⇒ OC SVM is a particular case of SVDD Stéphane Canu (INSA Rouen - LITIS) May 12, 2014 6 / 35
  • 9. When ∀i = 1, n, xi 2 = 1 0 c xi − c 2 ≤ R2 ⇔ w xi ≥ ρ with ρ = 1 2 ( c 2 − R + 1) SVDD and OCSVM "Belonging to the ball" is also "being above" an hyperplane Stéphane Canu (INSA Rouen - LITIS) May 12, 2014 7 / 35
  • 10. MEB: KKT L(c, R, α) = R2 + n i=1 αi xi − c 2 − R2 KKT conditionns : stationarty 2c n i=1 αi − 2 n i=1 αi xi = 0 ← The representer theorem 1 − n i=1 αi = 0 primal admiss. xi − c 2 ≤ R2 dual admiss. αi ≥ 0 i = 1, n complementarity αi xi − c 2 − R2 = 0 i = 1, n Stéphane Canu (INSA Rouen - LITIS) May 12, 2014 8 / 35
  • 11. MEB: KKT the radius L(c, R, α) = R2 + n i=1 αi xi − c 2 − R2 KKT conditionns : stationarty 2c n i=1 αi − 2 n i=1 αi xi = 0 ← The representer theorem 1 − n i=1 αi = 0 primal admiss. xi − c 2 ≤ R2 dual admiss. αi ≥ 0 i = 1, n complementarity αi xi − c 2 − R2 = 0 i = 1, n Complementarity tells us: two groups of points the support vectors xi − c 2 = R2 and the insiders αi = 0 Stéphane Canu (INSA Rouen - LITIS) May 12, 2014 8 / 35
  • 12. MEB: Dual The representer theorem: c = n i=1 αi xi n i=1 αi = n i=1 αi xi L(α) = n i=1 αi xi − n j=1 αj xj 2 n i=1 n j=1 αi αj xi xj = α Gα and n i=1 αi xi xi = α diag(G) with G = XX the Gram matrix: Gij = xi xj ,    min α∈IRn α Gα − α diag(G) with e α = 1 and 0 ≤ αi , i = 1 . . . n Stéphane Canu (INSA Rouen - LITIS) May 12, 2014 9 / 35
  • 13. SVDD primal vs. dual Primal    min R∈IR,c∈IRd R2 with xi − c 2 ≤ R2, i = 1, . . . , n d + 1 unknown n constraints can be recast as a QP perfect when d << n Dual    min α α Gα − α diag(G) with e α = 1 and 0 ≤ αi , i = 1 . . . n n unknown with G the pairwise influence Gram matrix n box constraints easy to solve to be used when d > n
  • 14. SVDD primal vs. dual Primal    min R∈IR,c∈IRd R2 with xi − c 2 ≤ R2, i = 1, . . . , n d + 1 unknown n constraints can be recast as a QP perfect when d << n Dual    min α α Gα − α diag(G) with e α = 1 and 0 ≤ αi , i = 1 . . . n n unknown with G the pairwise influence Gram matrix n box constraints easy to solve to be used when d > n But where is R2?
  • 15. Looking for R2 min α α Gα − α diag(G) with e α = 1, 0 ≤ αi , i = 1, n The Lagrangian: L(α, µ, β) = α Gα − α diag(G) + µ(e α − 1) − β α Stationarity cond.: αL(α, µ, β) = 2Gα − diag(G) + µe − β = 0 The bi dual min α α Gα + µ with −2Gα + diag(G) ≤ µe by identification R2 = µ + α Gα = µ + c 2 µ is the Lagrange multiplier associated with the equality constraint n i=1 αi = 1 Also, because of the complementarity condition, if xi is a support vector, then βi = 0 implies αi > 0 and R2 = xi − c 2 .
  • 16. Plan 1 Support Vector Data Description (SVDD) SVDD, the smallest enclosing ball problem The minimum enclosing ball problem with errors The minimum enclosing ball problem in a RKHS The two class Support vector data description (SVDD) Stéphane Canu (INSA Rouen - LITIS) May 12, 2014 12 / 35
  • 17. The minimum enclosing ball problem with errors the slack The same road map: initial formuation reformulation (as a QP) Lagrangian, KKT dual formulation bi dual Initial formulation: for a given C    min R,a,ξ R2 + C n i=1 ξi with xi − c 2 ≤ R2 + ξi , i = 1, . . . , n and ξi ≥ 0, i = 1, . . . , n Stéphane Canu (INSA Rouen - LITIS) May 12, 2014 13 / 35
  • 18. The MEB with slack: QP, KKT, dual and R2 SVDD as a QP:    min w,ρ 1 2 w 2 − ρ + C 2 n i=1 ξi with w xi ≥ ρ + 1 2 xi 2 − 1 2 ξi and ξi ≥ 0, i = 1, n again with OC SVM as a particular case. With G = XX Dual SVDD:    min α α Gα − α diag(G) with e α = 1 and 0 ≤ αi ≤ C, i = 1, n for a given C ≤ 1. If C is larger than one it is useless (it’s the no slack case) R2 = µ + c c with µ denoting the Lagrange multiplier associated with the equality constraint n i=1 αi = 1.
  • 19. Variations over SVDD Adaptive SVDD: the weighted error case for given wi , i = 1, n    min c∈IRp,R∈IR,ξ∈IRn R + C n i=1 wi ξi with xi − c 2 ≤ R+ξi ξi ≥ 0 i = 1, n The dual of this problem is a QP [see for instance Liu et al., 2013] min α∈IRn α XX α − α diag(XX ) with n i=1 αi = 1 0 ≤ αi ≤ Cwi i = 1, n Density induced SVDD (D-SVDD):    min c∈IRp,R∈IR,ξ∈IRn R + C n i=1 ξi with wi xi − c 2 ≤ R+ξi ξi ≥ 0 i = 1, n
  • 20. Plan 1 Support Vector Data Description (SVDD) SVDD, the smallest enclosing ball problem The minimum enclosing ball problem with errors The minimum enclosing ball problem in a RKHS The two class Support vector data description (SVDD) Stéphane Canu (INSA Rouen - LITIS) May 12, 2014 16 / 35
  • 21. SVDD in a RKHS The feature map: IRp −→ H c −→ f (•) xi −→ k(xi , •) xi − c IRp ≤ R2 −→ k(xi , •) − f (•) 2 H ≤ R2 Kernelized SVDD (in a RKHS) is also a QP    min f ∈H,R∈IR,ξ∈IRn R2 + C n i=1 ξi with k(xi , •) − f (•) 2 H ≤ R2 +ξi i = 1, n ξi ≥ 0 i = 1, n
  • 22. SVDD in a RKHS: KKT, Dual and R2 L = R2 + C n i=1 ξi + n i=1 αi k(xi , .) − f (.) 2 H − R2 −ξi − n i=1 βi ξi = R2 + C n i=1 ξi + n i=1 αi k(xi , xi ) − 2f (xi ) + f 2 H − R2 −ξi − n i=1 βi ξi KKT conditions Stationarity 2f (.) n i=1 αi − 2 n i=1 αi k(., xi ) = 0 ← The representer theorem 1 − n i=1 αi = 0 C − αi − βi = 0 Primal admissibility: k(xi , .) − f (.) 2 ≤ R2 + ξi , ξi ≥ 0 Dual admissibility: αi ≥ 0 , βi ≥ 0 Complementarity αi k(xi , .) − f (.) 2 − R2 − ξi = 0 βi ξi = 0
  • 23. SVDD in a RKHS: Dual and R2 L(α) = n i=1 αi k(xi , xi ) − 2 n i=1 f (xi ) + f 2 H with f (.) = n j=1 αj k(., xj ) = n i=1 αi k(xi , xi ) − n i=1 n j=1 αi αj k(xi , xj ) Gij Gij = k(xi , xj )    min α α Gα − α diag(G) with e α = 1 and 0 ≤ αi ≤ C, i = 1 . . . n As it is in the linear case: R2 = µ + f 2 H with µ denoting the Lagrange multiplier associated with the equality constraint n i=1 αi = 1.
  • 24. SVDD train and val in a RKHS Train using the dual form (in: G, C; out: α, µ)    min α α Gα − α diag(G) with e α = 1 and 0 ≤ αi ≤ C, i = 1 . . . n Val with the center in the RKHS: f (.) = n i=1 αi k(., xi ) φ(x) = k(x, .) − f (.) 2 H − R2 = k(x, .) 2 H − 2 k(x, .), f (.) H + f (.) 2 H − R2 = k(x, x) − 2f (x) + R2 − µ − R2 = −2f (x) + k(x, x) − µ = −2 n i=1 αi k(x, xi ) + k(x, x) − µ φ(x) = 0 is the decision border Stéphane Canu (INSA Rouen - LITIS) May 12, 2014 20 / 35
  • 25. An important theoretical result For a well-calibrated bandwidth, The SVDD estimates the underlying distribution level set [Vert and Vert, 2006] The level sets of a probability density function IP(x) are the set Cp = {x ∈ IRd | IP(x) ≥ p} It is well estimated by the empirical minimum volume set Vp = {x ∈ IRd | k(x, .) − f (.) 2 H − R2 ≥ 0} The frontiers coincides
  • 26. SVDD: the generalization error For a well-calibrated bandwidth, (x1, . . . , xn) i.i.d. from some fixed but unknown IP(x) Then [Shawe-Taylor and Cristianini, 2004] with probability at least 1 − δ, (∀δ ∈]0, 1[), for any margin m > 0 IP k(x, .) − f (.) 2 H ≥ R2 + m ≤ 1 mn n i=1 ξi + 6R2 m √ n + 3 ln(2/δ) 2n
  • 27. Equivalence between SVDD and OCSVM for translation invariant kernels (diagonal constant kernels) Theorem Let H be a RKHS on some domain X endowed with kernel k. If there exists some constant c such that ∀x ∈ X, k(x, x) = c, then the two following problems are equivalent,    min f ,R,ξ R + C n i=1 ξi with k(xi , .) − f (.) 2 H ≤ R+ξi ξi ≥ 0 i = 1, n    min f ,ρ,ξ 1 2 f 2 H − ρ + C n i=1 εi with f (xi ) ≥ ρ − εi εi ≥ 0 i = 1, n with ρ = 1 2(c + f 2 H − R) and εi = 1 2ξi .
• 28. Proof of the equivalence between SVDD and OCSVM

    min_{f∈H, R∈IR, ξ∈IRⁿ}  R + C Σi ξi
    with  ‖k(xi, ·) − f(·)‖²_H ≤ R + ξi,  ξi ≥ 0,  i = 1, n.

Since ‖k(xi, ·) − f(·)‖²_H = k(xi, xi) + ‖f‖²_H − 2f(xi), this is

    min_{f,R,ξ}  R + C Σi ξi
    with  2f(xi) ≥ k(xi, xi) + ‖f‖²_H − R − ξi,  ξi ≥ 0,  i = 1, n.

Introducing ρ = ½ (c + ‖f‖²_H − R), that is R = c + ‖f‖²_H − 2ρ, and since k(xi, xi) is constant and equal to c, the SVDD problem becomes

    min_{f,ρ,ξ}  ½ ‖f‖²_H − ρ + (C/2) Σi ξi
    with  f(xi) ≥ ρ − ½ ξi,  ξi ≥ 0,  i = 1, n
• 29. …leading to the classical one-class SVM formulation (OCSVM)

    min_{f∈H, ρ∈IR, ε∈IRⁿ}  ½ ‖f‖²_H − ρ + C Σi εi
    with  f(xi) ≥ ρ − εi,  εi ≥ 0,  i = 1, n

with εi = ½ ξi. Note that by putting ν = 1/(nC) we obtain the so-called ν-formulation of the OCSVM

    min_{f′∈H, ρ′∈IR, ξ′∈IRⁿ}  ½ ‖f′‖²_H − nνρ′ + Σi ξ′i
    with  f′(xi) ≥ ρ′ − ξ′i,  ξ′i ≥ 0,  i = 1, n

with f′ = Cf, ρ′ = Cρ, and ξ′ = Cε.
• 30. Duality

The dual of the SVDD is

    min_{α∈IRⁿ}  αᵀGα − αᵀg
    with  Σi αi = 1,  0 ≤ αi ≤ C,  i = 1, n

where G is the kernel matrix of general term Gi,j = k(xi, xj) and g the diagonal vector such that gi = k(xi, xi) = c. The dual of the OCSVM is the following equivalent QP:

    min_{α∈IRⁿ}  ½ αᵀGα
    with  Σi αi = 1,  0 ≤ αi ≤ C,  i = 1, n

Both dual forms provide the same solution α, but not the same Lagrange multipliers. ρ is the Lagrange multiplier of the equality constraint of the OCSVM dual, and R = c + αᵀGα − 2ρ. Using the SVDD dual, it turns out that R = λeq + αᵀGα, where λeq is the Lagrange multiplier of the equality constraint of the SVDD dual form.
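This equivalence is easy to check numerically: on the feasible set, αᵀg = c·eᵀα = c is constant, so the two dual objectives differ only by a constant and a factor 2 and share their minimizer. A minimal sketch, assuming a Gaussian kernel (constant diagonal) and SLSQP as a stand-in QP solver; the helper `qp` and the toy data are ours.

```python
import numpy as np
from scipy.optimize import minimize

def qp(Q, l, C):
    """Sketch solver for  min_a a'Qa + l'a  s.t.  e'a = 1, 0 <= a_i <= C."""
    n = len(l)
    return minimize(lambda a: a @ Q @ a + l @ a,
                    np.full(n, 1.0 / n),
                    jac=lambda a: 2 * Q @ a + l,
                    method="SLSQP", bounds=[(0.0, C)] * n,
                    constraints={"type": "eq", "fun": lambda a: a.sum() - 1.0}).x

rng = np.random.default_rng(1)
X = rng.normal(size=(15, 2))
G = np.exp(-((X[:, None] - X[None, :]) ** 2).sum(-1) / 2.0)  # RBF: diag(G) = e
a_svdd = qp(G, -np.diag(G), C=0.5)           # SVDD dual:  min a'Ga - a'g
a_ocsvm = qp(0.5 * G, np.zeros(15), C=0.5)   # OCSVM dual: min (1/2) a'Ga
# a_svdd and a_ocsvm agree up to solver tolerance
```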
• 31. Plan
1 Support Vector Data Description (SVDD)
    SVDD, the smallest enclosing ball problem
    The minimum enclosing ball problem with errors
    The minimum enclosing ball problem in a RKHS
    The two class Support vector data description (SVDD)
• 32. The two class Support vector data description (SVDD)

[figure: scatter plots of the two-class data with the enclosing ball]

    min_{c,R,ξ⁺,ξ⁻}  R² + C ( Σ_{yi=1} ξ⁺i + Σ_{yi=−1} ξ⁻i )
    with  ‖xi − c‖² ≤ R² + ξ⁺i,  ξ⁺i ≥ 0,  for i such that yi = 1
    and   ‖xi − c‖² ≥ R² − ξ⁻i,  ξ⁻i ≥ 0,  for i such that yi = −1
• 33. The two class SVDD as a QP

    min_{c,R,ξ⁺,ξ⁻}  R² + C ( Σ_{yi=1} ξ⁺i + Σ_{yi=−1} ξ⁻i )
    with  ‖xi − c‖² ≤ R² + ξ⁺i,  ξ⁺i ≥ 0,  for i such that yi = 1
    and   ‖xi − c‖² ≥ R² − ξ⁻i,  ξ⁻i ≥ 0,  for i such that yi = −1

Expanding the squares,

    ‖xi‖² − 2xiᵀc + ‖c‖² ≤ R² + ξ⁺i  for yi = 1
    ‖xi‖² − 2xiᵀc + ‖c‖² ≥ R² − ξ⁻i  for yi = −1

that is

    2xiᵀc ≥ ‖c‖² − R² + ‖xi‖² − ξ⁺i     for yi = 1
    −2xiᵀc ≥ −‖c‖² + R² − ‖xi‖² − ξ⁻i   for yi = −1

or, with a single slack variable ξi,

    2yi xiᵀc ≥ yi (‖c‖² − R² + ‖xi‖²) − ξi,  ξi ≥ 0,  i = 1, n.

Change of variable ρ = ‖c‖² − R²:

    min_{c,ρ,ξ}  ‖c‖² − ρ + C Σi ξi
    with  2yi xiᵀc ≥ yi (ρ + ‖xi‖²) − ξi  and  ξi ≥ 0,  i = 1, n
• 34. The dual of the two class SVDD

With Gij = yi yj xiᵀxj, the dual formulation is

    min_{α∈IRⁿ}  αᵀGα − Σi αi yi ‖xi‖²
    with  Σi yi αi = 1,  0 ≤ αi ≤ C,  i = 1, n
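A numerical sketch of this dual on a toy problem of our own construction (positives on the unit circle, negatives on a circle of radius 4; SLSQP standing in for a real QP solver). By symmetry the optimal center should sit at the origin, with the positives inside the ball and the negatives outside.

```python
import numpy as np
from scipy.optimize import minimize

# Toy data: positive class on the unit circle, negative class on radius 4
ang = np.array([0.0, 0.5, 1.0, 1.5]) * np.pi
Xp = np.c_[np.cos(ang), np.sin(ang)]
X = np.vstack([Xp, 4.0 * Xp])
y = np.r_[np.ones(4), -np.ones(4)]
n, C = len(X), 10.0

G = (y[:, None] * y[None, :]) * (X @ X.T)    # G_ij = y_i y_j x_i' x_j
lin = -y * (X**2).sum(1)                     # linear term: - sum_i a_i y_i ||x_i||^2
a0 = np.where(y > 0, 0.25, 0.0)              # feasible start: y'a = 1
res = minimize(lambda a: a @ G @ a + lin @ a, a0,
               jac=lambda a: 2 * G @ a + lin,
               method="SLSQP", bounds=[(0.0, C)] * n,
               constraints={"type": "eq", "fun": lambda a: a @ y - 1.0})
alpha = res.x
c = (alpha * y) @ X                          # center: c = sum_i alpha_i y_i x_i
```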
• 35. The two class SVDD vs. one class SVDD

[figure: the two class SVDD (left) vs. the one class SVDD (right)]
• 36. Small Sphere and Large Margin (SSLM) approach

Support vector data description with margin [Wu and Ye, 2009]:

    min_{c,R,ξ}  R² + C ( Σ_{yi=1} ξ⁺i + Σ_{yi=−1} ξ⁻i )
    with  ‖xi − c‖² ≤ R² − 1 + ξ⁺i,  ξ⁺i ≥ 0,  for i such that yi = 1
    and   ‖xi − c‖² ≥ R² + 1 − ξ⁻i,  ξ⁻i ≥ 0,  for i such that yi = −1

For yi = −1, ‖xi − c‖² ≥ R² + 1 − ξ⁻i ⇔ yi ‖xi − c‖² ≤ yi R² − 1 + ξ⁻i, so that

    L(c, R, ξ, α, β) = R² + C Σi ξi + Σi αi ( yi ‖xi − c‖² − yi R² + 1 − ξi ) − Σi βi ξi
• 37. SVDD with margin – dual formulation

    L(c, R, ξ, α, β) = R² + C Σi ξi + Σi αi ( yi ‖xi − c‖² − yi R² + 1 − ξi ) − Σi βi ξi

Optimality:  c = Σi αi yi xi ;  Σi αi yi = 1 ;  0 ≤ αi ≤ C.

    L(α) = Σi αi yi ‖xi − Σj αj yj xj‖² + Σi αi
         = − Σi Σj αi αj yi yj xjᵀxi + Σi αi yi ‖xi‖² + Σi αi

The dual SSLM is also a quadratic programming problem:

    min_{α∈IRⁿ}  αᵀGα − eᵀα − fᵀα
    with  yᵀα = 1  and  0 ≤ αi ≤ C,  i = 1, n

with G the symmetric n × n matrix of general term Gij = yi yj xjᵀxi and fi = yi ‖xi‖².
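The SSLM dual differs from the two class SVDD dual only through the extra linear term −eᵀα, which encodes the unit margin. A toy sketch (our own data and solver choice, not the paper's experiments): positives on the unit circle, negatives on a circle of radius 4. When the slacks vanish, the constraints ‖xi − c‖² ≤ R² − 1 and ‖xi − c‖² ≥ R² + 1 force the squared distances of the two classes to be separated by at least 2.

```python
import numpy as np
from scipy.optimize import minimize

# Toy data: positive class on the unit circle, negative class on radius 4
ang = np.array([0.0, 0.5, 1.0, 1.5]) * np.pi
Xp = np.c_[np.cos(ang), np.sin(ang)]
X = np.vstack([Xp, 4.0 * Xp])
y = np.r_[np.ones(4), -np.ones(4)]
n, C = len(X), 10.0

G = (y[:, None] * y[None, :]) * (X @ X.T)    # G_ij = y_i y_j x_j' x_i
f = y * (X**2).sum(1)                        # f_i = y_i ||x_i||^2
lin = -(np.ones(n) + f)                      # dual linear term: - e'a - f'a
a0 = np.where(y > 0, 0.25, 0.0)              # feasible start: y'a = 1
res = minimize(lambda a: a @ G @ a + lin @ a, a0,
               jac=lambda a: 2 * G @ a + lin,
               method="SLSQP", bounds=[(0.0, C)] * n,
               constraints={"type": "eq", "fun": lambda a: a @ y - 1.0})
alpha = res.x
c = (alpha * y) @ X                          # center: c = sum_i alpha_i y_i x_i
```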
• 38. Conclusion

Applications:
    outlier detection
    change detection
    clustering
    large number of classes
    variable selection, …

A clear path:
    reformulation (to a standard problem)
    KKT
    dual
    bidual

A lot of variations:
    L2 SVDD
    two non symmetric classes
    two symmetric classes (SVM)
    the multi-class issue
    practical problems with translation invariant kernels
• 39. Bibliography

Bo Liu, Yanshan Xiao, Longbing Cao, Zhifeng Hao, and Feiqi Deng. SVDD-based outlier detection on uncertain data. Knowledge and Information Systems, 34(3):597–618, 2013.
B. Schölkopf and A. J. Smola. Learning with Kernels. MIT Press, 2002.
John Shawe-Taylor and Nello Cristianini. Kernel Methods for Pattern Analysis. Cambridge University Press, 2004.
David M. J. Tax and Robert P. W. Duin. Support vector data description. Machine Learning, 54(1):45–66, 2004.
Régis Vert and Jean-Philippe Vert. Consistency and convergence rates of one-class SVMs and related algorithms. Journal of Machine Learning Research, 7:817–854, 2006.
Mingrui Wu and Jieping Ye. A small sphere and large margin approach for novelty detection using training data with outliers. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(11):2088–2092, 2009.