Non-negative Matrix Factorization:
Applications and Algorithms
Trial Lecture
Akanksha Agrawal
Matrix
A (4 x 3) =
        1     2     3
1      55   119    11
2    -112   456   154
3     513    33   223
4     324   123   543
Notation: A;1 is column 1 of A, A2; is row 2 of A, and a32 is the entry in row 3 and column 2.
Factor of a Matrix
A (n x m) = W (n x k) x H (k x m)
Minimize k.
The minimum k is the rank, r, of A.
k ≤ r: Take the columns of W to be a basis of the vector space spanned by the columns of A, so W has order n x r and H has order r x m. Column j of A is a linear combination a1 W;1 + a2 W;2 + … + ar W;r of the basis columns, and column j of H holds the coefficients a1, a2, …, ar.
r ≤ k: If A = W x H with W of order n x k and H of order k x m, then from the columns of W we can obtain a generating set of the vector space spanned by the columns of A.
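The basis argument above can be checked on a small instance. The sketch below is only an illustration: the 3 x 3 matrix, the choice of basis, and the helper name `matmul` are assumptions made for this example and do not come from the slides.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


# Hypothetical 3 x 3 matrix of rank 2: column 3 equals column 1 + column 2.
A = [[1, 0, 1],
     [0, 1, 1],
     [2, 3, 5]]

# W (3 x 2): a basis of the column space of A, here the first two columns of A.
W = [[1, 0],
     [0, 1],
     [2, 3]]

# H (2 x 3): column j holds the coefficients of column j of A in that basis.
H = [[1, 0, 1],
     [0, 1, 1]]

# The product reproduces A, so k = 2 = rank(A) suffices for this matrix.
assert matmul(W, H) == A
```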
Non-Negative Matrix
All elements are non-negative.
The matrix A above is not non-negative, because a21 = -112. Replacing that entry by 112 gives a non-negative matrix (4 x 3):
        1     2     3
1      55   119    11
2     112   456   154
3     513    33   223
4     324   123   543
Non-negative (Exact) Factor of a Non-negative Matrix
A (n x m, non-negative) = W (n x k, non-negative) x H (k x m, non-negative)
Minimize k.
The minimum such k is the non-negative rank of A.
Decision Version of the Problem
Exact Non-negative Matrix Factorization (ENMF)
Input: An n x m non-negative matrix A and an integer k.
Question: Are there non-negative matrices W and H such that A = W x H, W is of order n x k, and H is of order k x m?
A Simple Algorithm for ENMF
(Cohen and Rothblum)
Given an instance (A,k), where A = (aij) is an n x m matrix.
Create variables: wij for the entries of W (n x k) and hij for the entries of H (k x m), so that the unknown factorization is A = W x H.
Create polynomial constraints [Const(A,k)]:
1. For all i,j: $w_{ij} \geq 0$ and $h_{ij} \geq 0$.
2. For all i,j: $a_{ij} = \sum_{l=1}^{k} w_{il} h_{lj}$.
Number of variables: nk + km.
Number of polynomial constraints: nm + nk + km.
(A,k) is a yes-instance of ENMF if and only if Const(A,k) is satisfiable (over reals).
We can find a solution to a set of polynomial inequalities in time $(Dp)^{O(x)}$, where
x: number of variables,
p: number of inequalities,
D: maximum degree of a polynomial inequality.
Plugging in D = 2, p = nm + nk + km, and x = nk + km, we can decide if Const(A,k) is satisfiable in time $O((nm)^{O(k(n+m))})$.
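To make the constraint system concrete, here is a minimal sketch that counts the variables and constraints of Const(A,k) and checks whether a given candidate pair (W, H) satisfies them. It only verifies a candidate; deciding satisfiability needs a solver for polynomial inequalities over the reals, which is not shown. The function names `count_const` and `satisfies_const` and the tolerance `tol` are assumptions for this example.

```python
def count_const(n, m, k):
    """Return (number of variables, number of polynomial constraints) of Const(A,k)."""
    return n * k + k * m, n * m + n * k + k * m


def satisfies_const(A, W, H, tol=1e-9):
    """Check whether a candidate pair (W, H) satisfies both constraint families."""
    n, m, k = len(A), len(A[0]), len(H)
    if len(W) != n or any(len(row) != k for row in W) or any(len(row) != m for row in H):
        return False
    # Constraint 1: every w_ij and h_ij is non-negative.
    if any(x < 0 for row in W + H for x in row):
        return False
    # Constraint 2: a_ij equals the sum over l of w_il * h_lj.
    return all(
        abs(A[i][j] - sum(W[i][l] * H[l][j] for l in range(k))) <= tol
        for i in range(n)
        for j in range(m)
    )


A = [[1, 2], [2, 4]]
print(count_const(2, 2, 1))                          # (4, 8)
print(satisfies_const(A, [[1], [2]], [[1, 2]]))      # True
print(satisfies_const(A, [[-1], [-2]], [[-1, -2]]))  # False: the product is A, but entries are negative
```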
An Illustration of Variable Reduction
Simplicial Factorization
Input: An n x m non-negative matrix A of rank k.
Question: Are there non-negative matrices W and H such that A = W x H, W is of order n x k, and H is of order k x m?
Simplicial Factorization
A (n x m, rank k) = W (n x k) x H (k x m)
Since A has rank k, W must have full column rank and H must have full row rank. As in "Factor of a Matrix", the columns of W then form a basis of the vector space spanned by the columns of A.
Goal: To design an algorithm for Simplicial Factorization that runs in time $O((nm)^{O(r^2)})$, where r = k is the rank of A.
Follow a similar approach as the algorithm for ENMF, but with a reduced number of variables.
Pseudo Inverse
Consider a matrix M of order p x q that has full column rank (rank q) or full row rank (rank p).
The (unique) pseudo inverse M+ of M satisfies the following:
M+ has all real entries;
M+ has order q x p;
M+ x M = Iq,q if M has full column rank, and M x M+ = Ip,p if M has full row rank.
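As a small illustration of the full column rank case, the sketch below computes the pseudo inverse of a p x 2 matrix through the standard closed form $M^{+} = (M^{T} M)^{-1} M^{T}$, using exact fractions. The restriction to two columns, the example matrix, and the helper names are assumptions made to keep the example short.

```python
from fractions import Fraction


def transpose(M):
    return [list(col) for col in zip(*M)]


def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def pinv_two_columns(M):
    """Pseudo inverse (M^T M)^{-1} M^T of an integer p x 2 matrix with full column rank."""
    Mt = transpose(M)
    G = matmul(Mt, M)  # 2 x 2 Gram matrix M^T M
    det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
    if det == 0:
        raise ValueError("M does not have full column rank")
    G_inv = [[Fraction(G[1][1], det), Fraction(-G[0][1], det)],
             [Fraction(-G[1][0], det), Fraction(G[0][0], det)]]
    return matmul(G_inv, Mt)


M = [[1, 0],
     [0, 1],
     [2, 3]]                 # order 3 x 2, rank 2
M_plus = pinv_two_columns(M)  # order 2 x 3

# M+ x M is the 2 x 2 identity.
assert matmul(M_plus, M) == [[1, 0], [0, 1]]
```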
Simplicial Factorization
A (n x m, rank k) = W (n x k, full column rank) x H (k x m, full row rank)
Pseudo inverses W+ and H+:
W+ has order k x n and H+ has order m x k;
W+ x W = Ik,k and H x H+ = Ik,k.
For every column i: A;i = W H;i, so W+ A;i = W+ W H;i = H;i (W+ W is k x k and H;i is k x 1).
For every row j: Aj; = Wj; H, so Aj; H+ = Wj; H H+ = Wj;.
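Applying the column identity to every column i and the row identity to every row j gives the matrix form of the same facts. The derivation below is a sketch under the assumptions already stated (A = W x H, W+ x W = Ik,k, and H x H+ = Ik,k):

```latex
\begin{align*}
W^{+} A &= W^{+} (W H) = (W^{+} W) H = I_{k,k} H = H, \\
A H^{+} &= (W H) H^{+} = W (H H^{+}) = W I_{k,k} = W.
\end{align*}
```

So both factors can be read off from A once the pseudo inverse of the other factor is known.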
Simplicial Factorization
C = {U1, U2,…, Uk} : A column basis for A.
R = {V1, V2,…, Vk} : A row basis for A.
AC (k x m): columns of A expressed in basis C. If column j of A is a1U1 + a2U2 + … + akUk, then column j of AC is (a1, a2, …, ak).
AR (n x k): rows of A expressed in basis R, defined analogously.
Simplicial Factorization
Lemma: A has a simplicial factorization if and only if for every column basis C and row basis R of A there are k x k matrices TC and TR such that:
TC AC and AR TR are non-negative;
AR TR TC AC = A.
Proof (if): W = AR TR (n x k) and H = TC AC (k x m) are non-negative and W x H = A, so A has a simplicial factorization by the two conditions and the construction of AC and AR.
Proof (only if): Let A = W x H with W of order n x k and H of order k x m. Let U (n x k) and V (k x m) be the matrices of the column basis and the row basis respectively, so that U x AC = A and AR x V = A.
Set TC = W+ x U and TR = V x H+ (both k x k).
TC x AC = W+ x U x AC = W+ x A = H (non-negative), using W+ A;i = W+ W H;i = H;i.
AR x TR = AR x V x H+ = A x H+ = W (non-negative), using Aj; H+ = Wj; H H+ = Wj;.
Hence TC AC and AR TR are non-negative, and AR TR TC AC = W x H = A.
Simplicial Factorization
Fix a column basis U (n x k) and a row basis V (k x m) of A, and compute AC (k x m) and AR (n x k).
Create variables: wij for the entries of TR (k x k) and hij for the entries of TC (k x k), so that the unknown identity is A = AR x TR x TC x AC.
Create polynomial constraints:
TC AC and AR TR are non-negative;
AR TR TC AC = A.
Number of variables: $2k^2$.
Number of polynomial constraints: poly(n,m,k).
We can find a solution to a set of polynomial inequalities in time $(Dp)^{O(x)}$, where
x: number of variables,
p: number of inequalities,
D: maximum degree of a polynomial inequality.
We can solve Simplicial Factorization in time $O((nm)^{O(k^2)})$.
Other Results on ENMF
[Vavasis] ENMF is known to be NP-hard.
[Arora et al.] Assuming ETH, there is no algorithm for ENMF running in time $O((nm)^{o(k)})$.
[Moitra] ENMF admits an algorithm running in time $O((nm)^{O(k^2)})$.
Exact Non-negative Matrix Factorization
A (n x m, non-negative) = W (n x k, non-negative) x H (k x m, non-negative)
For most applications, a close approximation is good enough.
Non-negative Matrix Factorization
A (n x m, non-negative) ≈ W (n x k, non-negative) x H (k x m, non-negative)
Notions of closeness
$f : M \times M \to \mathbb{R} \cup \{\infty\}$
Is f symmetric? If yes, f is a distance; if no, f is a divergence.
Example: Distance Function
Square of Euclidean distance:
For matrices A and B (of same order)
$\|A - B\|^2 = \sum_{i,j} (A_{ij} - B_{ij})^2$
$\|A - B\|^2 = 0$ if and only if A = B
Example: Divergence Function
For matrices A and B (of same order)
$D(A \| B) = \sum_{i,j} \left( A_{ij} \log \frac{A_{ij}}{B_{ij}} - A_{ij} + B_{ij} \right)$
$D(A \| B) = 0$ if and only if A = B
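Both example functions can be sketched in a few lines. The code below assumes matrices are lists of rows, that all entries of B are positive, and that a term with Aij = 0 contributes 0 to the logarithmic part (a common convention, stated here as an assumption). The function names are chosen for this example.

```python
import math


def squared_euclidean(A, B):
    """Sum over all i, j of (A_ij - B_ij)^2."""
    return sum((a - b) ** 2 for ra, rb in zip(A, B) for a, b in zip(ra, rb))


def divergence(A, B):
    """Sum over all i, j of A_ij * log(A_ij / B_ij) - A_ij + B_ij."""
    total = 0.0
    for ra, rb in zip(A, B):
        for a, b in zip(ra, rb):
            # Assumed convention: 0 * log(0 / b) counts as 0.
            log_term = a * math.log(a / b) if a > 0 else 0.0
            total += log_term - a + b
    return total


A = [[1.0, 2.0], [3.0, 4.0]]
B = [[2.0, 2.0], [3.0, 1.0]]

print(squared_euclidean(A, B), squared_euclidean(B, A))  # 10.0 10.0 (symmetric)
print(divergence(A, B), divergence(B, A))                # two different values (not symmetric)
```

The squared Euclidean distance gives the same value in both directions, while the divergence does not, which is the distinction drawn under "Notions of closeness".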
General Scheme of Algorithm: Non-negative Matrix Factorization
Input: A, W(0), H(0), and t=1.
Output: W and H.
While true:
1. Fix H(t-1) and find W(t), such that D(A, W(t)H(t-1)) ≤ D(A, W(t-1)H(t-1)).
2. Fix W(t) and find H(t), such that D(A, W(t)H(t)) ≤ D(A, W(t)H(t-1)).
3. If the convergence criterion is satisfied, return W(t) and H(t).
4. t=t+1.
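The scheme leaves the update rule open. One well-known instance is the multiplicative update rule of Lee and Seung for the squared Euclidean distance; the sketch below uses it as an assumed choice, not as the method prescribed by the slides. The function name `nmf_multiplicative`, the random seeding, the small constant `eps` that guards against division by zero, and the stopping threshold `tol` are all assumptions for this example.

```python
import random


def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def transpose(M):
    return [list(col) for col in zip(*M)]


def sq_error(A, W, H):
    """Squared Euclidean distance between A and W x H."""
    WH = matmul(W, H)
    return sum((a - b) ** 2 for ra, rb in zip(A, WH) for a, b in zip(ra, rb))


def nmf_multiplicative(A, k, iterations=200, tol=1e-12, seed=0, eps=1e-12):
    rng = random.Random(seed)
    n, m = len(A), len(A[0])
    # Seeding: strictly positive random entries for W(0) and H(0).
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    errors = [sq_error(A, W, H)]
    for _ in range(iterations):
        # Step 1: fix H and update W entrywise by W * (A H^T) / (W H H^T).
        Ht = transpose(H)
        num, den = matmul(A, Ht), matmul(matmul(W, H), Ht)
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)] for i in range(n)]
        # Step 2: fix W and update H entrywise by H * (W^T A) / (W^T W H).
        Wt = transpose(W)
        num, den = matmul(Wt, A), matmul(matmul(Wt, W), H)
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)] for i in range(k)]
        errors.append(sq_error(A, W, H))
        # Step 3: stop once the error no longer decreases noticeably.
        if errors[-2] - errors[-1] < tol:
            break
    return W, H, errors


A = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]  # hypothetical non-negative matrix of rank 1
W, H, errors = nmf_multiplicative(A, k=1)
print(errors[0], errors[-1])  # the final error is far below the initial one
```

Because the updates only multiply and divide non-negative quantities, W and H stay non-negative throughout without any explicit projection step.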
Main Challenges in Designing Better NMF Algorithms
Getting a good seeding for initialisation of W and H.
Devising updating rules for W and H at subsequent iterations.
Selecting distance/divergence norms based on the application.
Proving, or giving enough evidence for, convergence of the algorithm.
Why are non-negative matrices and non-negative factorisations important?
Origin of Non-negative Matrix Factorization
Evolved from Principal Component Analysis, which is used for dimension reduction.
Disadvantage: Both positive and negative elements appear in principal components and coefficients in linear combinations.
Hard to interpret results in applications like storing pixel brightness.
Applications
Image Processing.
The work of Lee and Seung (1999) attracted a lot of attention to NMF.
Data represented as a non-negative matrix of pixels.
NMF can find A ≈ W x H.
W is the basis matrix; its columns can be regarded as parts like nose, ear, eye, etc.
H encodes the weights of each of the basic parts of the face.
Therefore, we obtain a compressed form of the data.
Applications
Clustering
This is regarded as one of the most successful applications of NMF.
Data represented as a non-negative matrix of pixels.
Its columns are samples described by n features.
NMF can find A ≈ W x H.
W is of order n x k, where k denotes the number of clusters.
H is used as the cluster membership indicator matrix.
Sample i is in cluster j if Hji is the largest value in H;i.
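The membership rule can be sketched directly: for each column of H, pick the row index with the largest entry. The function name `assign_clusters` and the example matrix are assumptions for this illustration; indices are zero-based, and ties go to the smallest cluster index.

```python
def assign_clusters(H):
    """Return, for each sample i (column of H), the cluster j that maximises H[j][i]."""
    k, m = len(H), len(H[0])
    return [max(range(k), key=lambda j: H[j][i]) for i in range(m)]


# Hypothetical 2 x 3 membership matrix: 2 clusters, 3 samples.
H = [[0.9, 0.1, 0.4],
     [0.1, 0.8, 0.6]]

print(assign_clusters(H))  # [0, 1, 1]
```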
Applications
Financial Data Mining
The stock price fluctuations seem to be
dominated by several underlying factors. NMF
has been used to obtain underlying trends
from the stock market data.
Thanks!

Non-negative Matrix Factorization

  • 1. Non-negative Matrix Factorization: Applications and Algorithms Trial Lecture Akanksha Agrawal
  • 2. Matrix 1 2 3 1 55 119 11 2 -112 456 154 3 513 33 223 4 324 123 543 4 x 3 A =
  • 3. Matrix 1 2 3 1 55 119 11 2 -112 456 154 3 513 33 223 4 324 123 543 4 x 3 A = A;1
  • 4. Matrix 1 2 3 1 55 119 11 2 -112 456 154 3 513 33 223 4 324 123 543 4 x 3 A = A2;
  • 5. Matrix 1 2 3 1 55 119 11 2 -112 456 154 3 513 33 223 4 324 123 543 4 x 3 A = a32
  • 6. n x m = xA W H Minimize k n x k k x m Factor of a Matrix
  • 7. n x m = xA W H Minimize k k is the rank, r of A n x k k x m Factor of a Matrix
  • 8. n x m n x k = x A basis k ≤ r k x m W H Factor of a Matrix
  • 9. n x m = x k ≤ r n x r r x m H Factor of a Matrix A basis
  • 10. n x m = x k ≤ r a1 a2 ar+ + + n x r r x m H Factor of a Matrix A basis j
  • 11. n x m = x k ≤ r a1 a2 ar+ + + j a1 a2 ar n x r r x m Factor of a Matrix A basis j
  • 12. n x m = x r ≤ k n x k k x m A H Factor of a Matrix
  • 13. n x m = x r ≤ k n x k k x m Can obtain a generating set of the vector space spanned by columns of A A H Factor of a Matrix
  • 14. Non-Negative Matrix 1 2 3 1 55 119 11 2 -112 456 154 3 513 33 223 4 324 123 543 4 x 3 All elements are non-negative
  • 15. Non-Negative Matrix 1 2 3 1 55 119 11 2 -112 456 154 3 513 33 223 4 324 123 543 4 x 3 All elements are non-negative
  • 16. Non-Negative Matrix 1 2 3 1 55 119 11 2 112 456 154 3 513 33 223 4 324 123 543 4 x 3 All elements are non-negative
  • 17. Non-negative (Exact) Factor of a Non-negative Matrix n x m = xA W H Minimize k n x k k x m non-negative non-negative non-negative
  • 18. Non-negative (Exact) Factor of a Non-negative Matrix n x m = xA W H Minimize k n x k k x m non-negative non-negative non-negative Non-negative rank
  • 19. Decision Version of the Problem Exact Non-negative Matrix Factorization (ENMF) Input: Question: An n x m non-negative matrix A and an integer k. Are there non-negative matrices W and H such that A = W x H, W is of order n x k, and H is of order k x m?
  • 20. (A,k) n x m A a11 a12 a1m a21 a22 a2m an1 an2 anm (Cohen and Rothblum) A Simple Algorithm for ENMF
  • 21. A Simple Algorithm for ENMF = W x n x k w11 w12 w1k w21 w22 w2k wn1 wn2 wnk k x m h11 h12 h1m h21 h22 h2m wk1 wk2 wkm Create variables n x m A a11 a12 a1m a21 a22 a2m an1 an2 anm H
  • 22. A Simple Algorithm for ENMF n x m A W a11 a12 a1m a21 a22 a2m an1 an2 anm x n x k w11 w12 w1k w21 w22 w2k wn1 wn2 wnk k x m h11 h12 h1m h21 h22 h2m wk1 wk2 wkm Create variables = H Create polynomial constraints: [Const(A,k)] 1. For all i,j wij, hij ≥ 0.
  • 23. A Simple Algorithm for ENMF n x m A W a11 a12 a1m a21 a22 a2m an1 an2 anm x n x k w11 w12 w1k w21 w22 w2k wn1 wn2 wnk H k x m h11 h12 h1m h21 h22 h2m wk1 wk2 wkm Create variables Create polynomial constraints: [Const(A,k)] 1. For all i,j wij, hij ≥ 0. 2. For all i,j, aij = wik hkj.∑ k =
  • 24. A Simple Algorithm for ENMF (A,k) n x m A a11 a12 a1m a21 a22 a2m an1 an2 anm Number of variables: Number of polynomial constraints: nk + km nm + nk + km (A,k) is a yes-instance of ENMF if and only if Const(A,k) is satisfiable (over reals)
  • 25. A Simple Algorithm for ENMF (A,k) n x m A a11 a12 a1m a21 a22 a2m an1 an2 anm Number of variables: Number of polynomial constraints: nk + km nm + nk + km We can find a solution to a set of polynomial inequalities in time (Dp)O(x)
  • 26. A Simple Algorithm for ENMF (A,k) n x m A a11 a12 a1m a21 a22 a2m an1 an2 anm Number of variables: Number of polynomial constraints: nk + km nm + nk + km We can find a solution to a set of polynomial inequalities in time (Dp)O(x) x: number of variables p: number of inequalities D: Maximum degree of a polynomial inequality
  • 27. A Simple Algorithm for ENMF (A,k) n x m A a11 a12 a1m a21 a22 a2m an1 an2 anm Number of variables: Number of polynomial constraints: nk + km nm + nk + km We can decide if Const(A,k) is satisfiable in time O((nm)O(k(n+m)))
  • 28. An Illustration of Variable Reduction Simplicial Factorization Input: Question: An n x m non-negative matrix A of rank k. Are there non-negative matrices W and H such that A = W x H, W is of order n x k, and H is of order k x m?
  • 29. n x m xA W H n x k k x m Rank k Simplicial Factorization =
  • 30. Simplicial Factorization n x m = xA W H n x k k x m Rank k Full column rank Full row rank
  • 31. n x m x k ≤ r n x r r x m H Factor of a Matrix A basis =
  • 32. Simplicial Factorization n x m xA W H n x k k x m Rank k Full column rank Full row rank =
  • 33. Simplicial Factorization Goal: To design an algorithm for Simplicial Factorization that runs in time O((nm)O(r )).2
  • 34. Simplicial Factorization Goal: To design an algorithm for Simplicial Factorization that runs in time O((nm)O(r )).2 Follow similar approach as the algorithm for ENMF, but apply with reduced number of variables.
  • 35. Pseudo Inverse Consider a full column (or row) rank matrix Mp,q of rank p (q). M+ has all real entries; M+ has order q x p; M+ x M = Iq,q and M x M+ = Ip,p. The (unique) pseudo inverse M+, of M satisfies the following:
  • 36. Simplicial Factorization n x m = xA W H n x k k x m Rank k Full column rank Full row rank
  • 37. Simplicial Factorization n x m = xA W H n x k k x m W+ W+ has order k x n and H+ has order m x k; W+ x W = Ik,k and H x H+ = Ik,k. H+Pseudo inverse:
  • 38. Simplicial Factorization n x m = xA W H n x k k x m W+ has order k x n and H+ has order m x k; W+ x W = Ik,k and H x H+ = Ik,k. A;i H;i W+ A;i = W+ W H;i = H;i
  • 39. Simplicial Factorization n x m = xA W H n x k k x m W+ has order k x n and H+ has order m x k; W+ x W = Ik,k and H x H+ = Ik,k. A;i H;i W+ A;i = W+ W H;i = H;i
  • 40. Simplicial Factorization n x m = xA W H n x k k x m W+ has order k x n and H+ has order m x k; W+ x W = Ik,k and H x H+ = Ik,k. A;i H;i k x k k x 1 W+ A;i = W+ W H;i = H;i
  • 41. Simplicial Factorization n x m = xA W H n x k k x m W+ has order k x n and H+ has order m x k; W+ x W = Ik,k and H x H+ = Ik,k. A;i H;i W+ A;i = W+ W H;i = H;i
  • 42. Simplicial Factorization n x m = xA W H n x k k x m W+ has order k x n and H+ has order m x k; W+ x W = Ik,k and H x H+ = Ik,k. A;i W+ A;i = W+ W H;i = H;i H;i
  • 43. Simplicial Factorization n x m = xA W H n x k k x m W+ has order k x n and H+ has order m x k; W+ x W = Ik,k and H x H+ = Ik,k. A;i H;i aji Wj; W+ A;i = W+ W H;i = H;i
  • 44. Simplicial Factorization n x m = xA W H n x k k x m W+ has order k x n and H+ has order m x k; W+ x W = Ik,k and H x H+ = Ik,k. A;i H;i W+ A;i = W+ W H;i = H;i
  • 45. Simplicial Factorization n x m = xA W H n x k k x m W+ has order k x n and H+ has order m x k; W+ x W = Ik,k and H x H+ = Ik,k. A;i H;i W+ A;i = W+ W H;i = H;i Aj; H+ = Wj; H H+ = Wj;
  • 46. Simplicial Factorization C = {U1, U2,…, Uk} : A column basis for A. R = {V1, V2,…, Vk} : A row basis for A. A Columns of A expressed in basic C a1U1 + a2U2 + … + akUk j AC k x mn x m a1 a2 ak j
  • 47. Simplicial Factorization AC AR Columns of A expressed in basic C Rows of A expressed in basic R n x kk x m C = {U1, U2,…, Uk} : A column basis for A. R = {V1, V2,…, Vk} : A row basis for A.
  • 48. Simplicial Factorization TC AC and AR TR are non-negative; AR TR TC AC = A. Lemma: A has a simplicial factor if and only if the for every column and row basis C and R of A there are k x k matrices TC and TR such that:
  • 49. Simplicial Factorization Lemma: A has a simplicial factor if and only if the for every column and row basis C and R of A there are k x k matrices TC and TR such that: TC AC and AR TR are non-negative; AR TR TC AC = A. A has a simplicial factors by the two conditions and the construction of AC and AR. n x k k x m
  • 50. Simplicial Factorization TC AC and AR TR are non-negative; AR TR TC AC = A. Lemma: A has a simplicial factor if and only if the for every column and row basis C and R of A there are k x k matrices TC and TR such that: A = W x H n x k k x m U and V be column and row basis respectively
  • 51. Simplicial Factorization A = W x H n x k k x m U and V be column and row basis respectively U n x k V k x m
  • 52. Simplicial Factorization A = W x H n x k k x m U and V be column and row basis respectively U n x k V k x m TC = W+ x U TR = V x H+
  • 53. Simplicial Factorization A = W x H n x k k x m U and V be column and row basis respectively k x k TC = W+ x U TR = V x H+
  • 54. Simplicial Factorization A = W x H n x k k x m U and V be column and row basis respectively TC = W+ x U TR = V x H+ k x k TC x AC = W+ x U x AC = H W+ A;i = W+ W H;i = H;i Aj; H+ = Wj; H H+ = Wj; (non -ve)
  • 55. Simplicial Factorization A = W x H n x k k x m U and V be column and row basis respectively TC = W+ x U TR = V x H+ k x k TC x AC = W+ x U x AC = H W+ A;i = W+ W H;i = H;i Aj; H+ = Wj; H H+ = Wj; AR x TR = AR x V x H+ = W (non -ve) (non -ve)
  • 56. Simplicial Factorization A = W x H n x k k x m U and V be column and row basis respectively TC = W+ x U TR = V x H+ k x k TC x AC = W+ x U x AC = H (non -ve) AR x TR = AR x V x H+ = W (non -ve) TC AC and AR TR are non-negative; AR TR TC AC = A.
  • 57. Simplicial Factorization TC AC and AR TR are non-negative; AR TR TC AC = A. Lemma: A has a simplicial factor if and only if the for every column and row basis C and R of A there are k x k matrices TC and TR such that:
  • 58. A Simple Algorithm for ENMF n x m A W a11 a12 a1m a21 a22 a2m an1 an2 anm x n x k w11 w12 w1k w21 w22 w2k wn1 wn2 wnk H k x m h11 h12 h1m h21 h22 h2m wk1 wk2 wkm Create variables Create polynomial constraints: [Const(A,k)] 1. For all i,j wij, hij ≥ 0. 2. For all i,j, aij = wik hkj.∑ k =
  • 59. Simplicial Factorization TC AC and AR TR are non-negative; AR TR TC AC = A. Lemma: A has a simplicial factor if and only if the for every column and row basis C and R of A there are k x k matrices TC and TR such that:
  • 60. U n x k V k x m n x m A a11 a12 a1m a21 a22 a2m an1 an2 anm Simplicial Factorization
  • 61. n x m A a11 a12 a1m a21 a22 a2m an1 an2 anm Simplicial Factorization AR AC k x mn x k k x k w11 w12 w1k w21 w22 w2k wk1 wk2 wkk TR k x k h11 h12 h1k h21 h22 h2k hk1 hk2 hkk TC =
  • 62. Simplicial Factorization: for the n x m matrix A, number of variables: 2k^2; number of polynomial constraints: poly(n,m,k). We can find a solution to a set of polynomial inequalities in time (Dp)^O(x), where x is the number of variables, p is the number of inequalities, and D is the maximum degree of a polynomial inequality.
  • 63. Simplicial Factorization: for the n x m matrix A, number of variables: 2k^2; number of polynomial constraints: poly(n,m,k). We can solve Simplicial Factorization in time O((nm)^O(k^2)).
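The running time on slide 63 follows by substituting the counts from slide 62 into the solver bound. The derivation below is a sketch under two assumptions that the slides do not state explicitly: the constraints have degree at most 2 (each entry of AR TR TC AC is bilinear in the variables of TR and TC), and k ≤ min(n, m), so that poly(n,m,k) is at most (nm)^O(1).

```latex
x = 2k^2, \qquad p = \mathrm{poly}(n,m,k), \qquad D \le 2
\quad\Longrightarrow\quad
(Dp)^{O(x)} = \bigl(2 \cdot \mathrm{poly}(n,m,k)\bigr)^{O(k^2)} = (nm)^{O(k^2)}
```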
  • 64. Other Results on ENMF [Vavasis] ENMF is known to be NP-Hard.
  • 65. Other Results on ENMF: [Vavasis] ENMF is known to be NP-Hard. [Arora et al.] Assuming ETH, there is no algorithm for ENMF running in time O((nm)^o(k)).
  • 66. Other Results on ENMF: [Vavasis] ENMF is known to be NP-Hard. [Arora et al.] Assuming ETH, there is no algorithm for ENMF running in time O((nm)^o(k)). [Moitra] ENMF admits an algorithm running in time O((nm)^O(k^2)).
  • 67. Exact Non-negative Matrix Factorization: A = W x H, where A (n x m), W (n x k), and H (k x m) are all non-negative. For most applications, a close approximation is good enough.
  • 68. Non-negative Matrix Factorization: A ≈ W x H, where A (n x m), W (n x k), and H (k x m) are all non-negative. For most applications, a close approximation is good enough.
  • 69. Notions of closeness: a function f: M x M → R ∪ {∞}. Is f symmetric? If so, f is a distance; otherwise, f is a divergence.
  • 70. Example: Distance Function. Square of the Euclidean distance: for matrices A and B (of the same order), || A - B ||^2 = ∑_{i,j} (Aij - Bij)^2. || A - B ||^2 = 0 if and only if A = B.
  • 71. Example: Divergence Function. For matrices A and B (of the same order), D(A || B) = ∑_{i,j} (Aij log(Aij/Bij) - Aij + Bij). D(A || B) = 0 if and only if A = B.
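Both closeness measures from slides 70 and 71 are short to compute, and a small example shows that the first is symmetric while the second is not. This is a minimal sketch in plain Python; the function names and the example matrices are illustrative, and the code assumes the usual convention 0 log 0 = 0 and strictly positive entries in B where the logarithm is taken (neither is stated on the slides).

```python
import math


def sq_euclidean(A, B):
    """Square of the Euclidean distance: sum over i,j of (Aij - Bij)^2."""
    return sum((a - b) ** 2 for ra, rb in zip(A, B) for a, b in zip(ra, rb))


def divergence(A, B):
    """D(A || B) = sum over i,j of (Aij log(Aij/Bij) - Aij + Bij).

    Assumption: terms with Aij = 0 contribute only Bij (0 log 0 = 0),
    and Bij > 0 whenever Aij > 0.
    """
    total = 0.0
    for ra, rb in zip(A, B):
        for a, b in zip(ra, rb):
            term = b - a
            if a > 0:
                term += a * math.log(a / b)
            total += term
    return total


# Hypothetical example matrices of the same order.
A = [[1.0, 2.0], [3.0, 4.0]]
B = [[2.0, 2.0], [3.0, 1.0]]

print(sq_euclidean(A, B), sq_euclidean(B, A))  # equal: the distance is symmetric
print(divergence(A, B), divergence(B, A))      # differ: the divergence is not symmetric
```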
  • 72. General Scheme of Algorithm: Non-negative Matrix Factorization. Input: A, W(0), H(0), and t=1. Output: W and H.
  • 73. General Scheme of Algorithm: Non-negative Matrix Factorization. Input: A, W(0), H(0), and t=1. Output: W and H. While true:
      1. Fix H(t-1) and find W(t) such that D(A, W(t)H(t-1)) ≤ D(A, W(t-1)H(t-1)).
      2. Fix W(t) and find H(t) such that D(A, W(t)H(t)) ≤ D(A, W(t)H(t-1)).
      3. If the convergence criterion is satisfied, return W(t) and H(t).
      4. t = t+1.
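The scheme above leaves the update rules and the convergence test open. As one possible instantiation, the sketch below plugs in the multiplicative update rules of Lee and Seung for the squared Euclidean distance; this choice, the stopping rule (improvement below `tol`), the small constant `eps` guarding against division by zero, and the example inputs are all assumptions made for illustration, not something the slides prescribe.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def transpose(X):
    return [list(col) for col in zip(*X)]


def sq_error(A, W, H):
    """Squared Euclidean distance between A and W x H."""
    P = matmul(W, H)
    return sum((a - p) ** 2 for ra, rp in zip(A, P) for a, p in zip(ra, rp))


def nmf(A, W, H, max_iter=200, tol=1e-9, eps=1e-12):
    """Alternate between updating W (H fixed) and H (W fixed)."""
    errors = [sq_error(A, W, H)]
    for t in range(1, max_iter + 1):
        # Step 1: fix H, update W entrywise: W <- W * (A H^T) / (W H H^T).
        Ht = transpose(H)
        num = matmul(A, Ht)
        den = matmul(matmul(W, H), Ht)
        W = [[w * n / (d + eps) for w, n, d in zip(rw, rn, rd)]
             for rw, rn, rd in zip(W, num, den)]
        # Step 2: fix W, update H entrywise: H <- H * (W^T A) / (W^T W H).
        Wt = transpose(W)
        num = matmul(Wt, A)
        den = matmul(matmul(Wt, W), H)
        H = [[h * n / (d + eps) for h, n, d in zip(rh, rn, rd)]
             for rh, rn, rd in zip(H, num, den)]
        errors.append(sq_error(A, W, H))
        # Step 3: assumed convergence test, stop when the improvement is tiny.
        if errors[-2] - errors[-1] <= tol:
            break
    return W, H, errors


# Hypothetical input: a 3 x 3 non-negative matrix and a fixed positive seeding with k = 2.
A = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [1.0, 1.0, 1.0]]
W0 = [[0.5, 0.2], [0.3, 0.9], [0.7, 0.4]]
H0 = [[0.6, 0.1, 0.8], [0.2, 0.5, 0.3]]
W, H, errors = nmf(A, W0, H0)
print(errors[0], errors[-1])  # error before and after the iterations
```

Because the updates only multiply and divide non-negative quantities, W and H stay non-negative whenever the seeding is non-negative, which is one reason this family of rules is a common starting point.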
  • 74. Main Challenges in Designing Better NMF Algorithms Getting a good seeding for initialisation of W and H.
  • 75. Main Challenges in Designing Better NMF Algorithms Getting a good seeding for initialisation of W and H. Devising updating rules for W and H at subsequent iterations.
  • 76. Main Challenges in Designing Better NMF Algorithms Getting a good seeding for initialisation of W and H. Devising updating rules for W and H at subsequent iterations. Selecting distance/divergence measures based on the application.
  • 77. Main Challenges in Designing Better NMF Algorithms Getting a good seeding for initialisation of W and H. Devising updating rules for W and H at subsequent iterations. Selecting distance/divergence measures based on the application. Proving, or giving enough evidence for, convergence of the algorithm.
  • 78. Why are non-negative matrices and non-negative factorisations important?
  • 79. Origin of Non-negative Matrix Factorization: NMF evolved from Principal Component Analysis (PCA), which is used for dimension reduction. Disadvantage of PCA: both positive and negative elements appear in the principal components and in the coefficients of the linear combinations, which makes results hard to interpret in applications such as storing pixel brightness.
  • 80. Applications: Image Processing. The work of Lee and Seung (1999) attracted a lot of attention to NMF.
  • 81. Applications: Image Processing. Data is represented as a non-negative matrix of pixels. NMF can find A ≈ W x H. W is the basis matrix; its columns can be regarded as parts of a face such as the nose, ears, and eyes.
  • 82. Applications: Image Processing. Data is represented as a non-negative matrix of pixels. NMF can find A ≈ W x H. H encodes the weight of each basic part of the face.
  • 83. Applications: Image Processing. Data is represented as a non-negative matrix of pixels. NMF can find A ≈ W x H. Therefore, we obtain a compressed form of the data.
  • 84. Applications: Clustering. This is regarded as one of the most successful applications of NMF.
  • 85. Applications: Clustering. Data is represented as a non-negative matrix of pixels. Its columns are samples, each described by n features.
  • 86. Applications: Clustering. Data is represented as a non-negative matrix of pixels. NMF can find A ≈ W x H. W is of order n x k, where k denotes the number of clusters.
  • 87. Applications: Clustering. Data is represented as a non-negative matrix of pixels. NMF can find A ≈ W x H. H is used as the cluster membership indicator matrix.
  • 88. Applications: Clustering. Data is represented as a non-negative matrix of pixels. NMF can find A ≈ W x H. Sample i is in cluster j if Hji is the largest value in H;i.
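The membership rule on slide 88 amounts to taking, for each column of H, the row index of its largest entry. A minimal sketch in plain Python follows; the function name `assign_clusters` and the example H are hypothetical, indices are 0-based here, and ties are broken in favour of the smallest cluster index (an assumption, since the slide does not say how ties are handled).

```python
def assign_clusters(H):
    """For each sample i (column H;i), return the cluster j maximizing Hji."""
    k = len(H)      # number of clusters (rows of H)
    m = len(H[0])   # number of samples (columns of H)
    return [max(range(k), key=lambda j: H[j][i]) for i in range(m)]


# Hypothetical 2 x 3 membership matrix: 2 clusters, 3 samples.
H = [[0.9, 0.1, 0.4],
     [0.2, 0.7, 0.6]]
print(assign_clusters(H))  # [0, 1, 1]
```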
  • 89. Applications Financial Data Mining The stock price fluctuations seem to be dominated by several underlying factors. NMF has been used to obtain underlying trends from the stock market data.