Self Organizing Map | October 26, 2004
Kohonen’s Self Organizing Map
Self Organizing Maps
Mahendra Mani Ojha 01005024
Pranshu Sharma 01005026
Shivendra S. Meena 01005030
Under the Guidance of:
Prof. Pushpak Bhattacharya
Overview
 Terminology used
 Introduction of SOM
 Components of SOM
 Structure of the map
 Training algorithms of the map
 Advantages and disadvantages
 Proof of Convergence
 Applications
 Conclusion
 Reference
Terminology used
 Clustering
 Unsupervised learning
 Euclidean Distance
p = (p_1, p_2, ..., p_n)
q = (q_1, q_2, ..., q_n)

ED = sqrt( Σ_{i=1}^{n} (p_i − q_i)^2 )
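The distance measure above can be written as a one-line helper (a minimal sketch; the function name is ours):

```python
import math

def euclidean_distance(p, q):
    """ED = sqrt(sum_{i=1}^{n} (p_i - q_i)^2) for equal-length vectors."""
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

print(euclidean_distance((0, 0), (3, 4)))  # prints 5.0
```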
Introduction of SOM
 Introduced by Prof. Teuvo Kohonen in 1982
 Also known as Kohonen feature map
 Unsupervised neural network
 Clustering tool of
high-dimensional
and complex data
Introduction of SOM contd…
 Maintains the topology of the dataset
 Training occurs via competition between the
neurons
 Impossible to assign network nodes to specific
input classes in advance
 Can be used for detecting similarity and degrees
of similarity
 It is assumed that input patterns fall into
sufficiently large, distinct groupings
 Random weight vector initialization
Components of SOM
 Sample data
 Weights
 Output nodes
Structure of the map
 2-dimensional or 1-dimensional grid
 Each grid point represents an output node
 The grid is initialized with random vectors
Training Algorithm
 Initialize Map
 For t from 0 to 1
– Select a sample
– Get best matching unit
– Scale neighbors
– Increase t a small amount
End for
m_i(t+1) = m_i(t) + α(t)·[x(t) − m_i(t)],  for i ∈ N_c(t)
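The loop above, with the update rule m_i(t+1) = m_i(t) + α(t)·[x(t) − m_i(t)], can be sketched in Python; the grid size, the decaying learning rate, and the shrinking neighborhood radius are illustrative choices, not values from the slides:

```python
import math
import random

def train_som(samples, grid_w=5, grid_h=5, dim=2, steps=200, seed=0):
    rng = random.Random(seed)
    # Initialize Map: each grid node gets a random weight vector.
    nodes = {(i, j): [rng.random() for _ in range(dim)]
             for i in range(grid_w) for j in range(grid_h)}
    for step in range(steps):
        t = step / steps                    # t runs from 0 toward 1
        x = rng.choice(samples)             # select a sample
        # Get best matching unit: node whose weights are nearest to x.
        bmu = min(nodes, key=lambda n: sum((a - b) ** 2
                                           for a, b in zip(nodes[n], x)))
        alpha = 0.5 * (1.0 - t)             # decaying learning rate (illustrative)
        radius = 2.0 * (1.0 - t) + 0.5      # shrinking neighborhood (illustrative)
        for n, m in nodes.items():
            if math.dist(n, bmu) <= radius:  # i in N_c(t): scale neighbors
                # m_i(t+1) = m_i(t) + alpha(t) * [x(t) - m_i(t)]
                for k in range(dim):
                    m[k] += alpha * (x[k] - m[k])
    return nodes
```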
Initializing the weights
 SOMs are computationally very expensive.
 Good initialization gives
– Fewer iterations
– Better map quality
Get Best Matching Unit
 Any vector-distance method can be used, e.g.
–Nearest neighbor
–Farthest neighbor
–Distance between means
–Distance between medians
 Most common method is Euclidean distance.
 If more than one node ties as the winner, choose randomly
ED = sqrt( Σ_{i=0}^{n} x_i^2 )
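A minimal BMU search using squared Euclidean distance, with the random tie-break mentioned above (function name and signature are illustrative):

```python
import random

def best_matching_unit(x, weights, rng=random):
    """Return the index of the weight vector nearest to x (Euclidean).
    Ties between equally close contestants are broken randomly."""
    d2 = [sum((xi - wi) ** 2 for xi, wi in zip(x, w)) for w in weights]
    best = min(d2)
    contestants = [i for i, d in enumerate(d2) if d == best]
    return rng.choice(contestants)
```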
Scale Neighbors
 Determining Neighbors
–Neighborhood size
Decreases over time
–Effect on neighbors
 Learning

α_i(t) = α(t) · exp( −(2/3) · ||r_i − r_c|| )
   α(t): learning coefficient, r: position vector

m_i(t+1) = m_i(t) + α_i(t)·[x(t) − m_i(t)],  if i ∈ N_c(t)
m_i(t+1) = m_i(t),  otherwise
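The neighbor-scaling rule above can be sketched as follows; the exp(−(2/3)·distance) factor follows the slide's formula, while the function names and the flat list representation are our assumptions:

```python
import math

def neighbor_alpha(alpha_t, r_i, r_c):
    """alpha_i(t) = alpha(t) * exp(-(2/3) * ||r_i - r_c||),
    where r_i, r_c are grid positions of node i and the winner c."""
    return alpha_t * math.exp(-(2.0 / 3.0) * math.dist(r_i, r_c))

def update_node(m_i, x, alpha_i, in_neighborhood):
    # m_i(t+1) = m_i(t) + alpha_i(t)[x(t) - m_i(t)] if i in N_c(t),
    # m_i(t+1) = m_i(t) otherwise.
    if not in_neighborhood:
        return list(m_i)
    return [m + alpha_i * (xk - m) for m, xk in zip(m_i, x)]
```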
Necessary conditions
 Amount of training data
 Change of weights should be
– In excited neighborhood
– Proportional to activation received
 Advantages
– Very easy to understand
– Works well
 Disadvantages
– Computationally expensive
– Every SOM is different
Proof of convergence
 Complete proof only for one dimension.
– Very trivial
 Almost all partial proofs are based on
– Markov chains
 Difficulties:
– No definition of “a correctly ordered configuration”
– Proved result: it is not possible to associate a
“global decreasing potential function” with this
algorithm.
WEBSOM
WebSOM (overview)
 Millions of documents to be searched
 Keywords or key phrases are used for searching
 Data is clustered
– according to similarity
– and context
 The result is, in effect, a similarity graph of the data
 For proper storage, raw text documents must be
encoded for mapping.
Feature Vectors / Encoding
 Can simply be word histograms of the
document.
(The histogram may serve as the input vector, but
that makes the input vector very large, so some
kind of reduction is needed)
 Reduction
– Reduction by random mapping
– Weighted word histogram (based on word frequency)
– By Latent Semantic Analysis
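Of the reductions listed, random mapping is the simplest to illustrate: project the histogram through a fixed random matrix. This is a hedged sketch; the Gaussian entries and seed handling are our assumptions, not details from the slides:

```python
import random

def random_mapping(histogram, target_dim, seed=0):
    """Reduce an n-dimensional word histogram to target_dim components
    via a fixed random matrix R: y = R @ h."""
    rng = random.Random(seed)
    n = len(histogram)
    R = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(target_dim)]
    return [sum(r_k * h_k for r_k, h_k in zip(row, histogram)) for row in R]
```

Because the seed is fixed, every document is projected through the same matrix, so distances between projected histograms remain comparable.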
WebSOM
 Architecture
– Word category Map
– Document category Map
 Modes of Operation
– Supervised
(some information about the class is given; e.g. in
a collection of newsgroup articles the name of the
newsgroup may be supplied)
– Unsupervised
(no information provided)
Word Category Map
 Preprocessing
– Remove unimportant data (e.g. images, signatures)
– Remove articles, prepositions, etc.
– Words occurring fewer than some fixed number of
times are treated as don’t-cares
– Replace synonymous words
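The preprocessing steps above can be sketched as follows; the stopword list, frequency threshold, and synonym table are illustrative stand-ins:

```python
from collections import Counter

STOPWORDS = {"a", "an", "the", "of", "in", "on", "to"}  # illustrative list

def preprocess(tokens, min_count=2, synonyms=None):
    """Drop articles/prepositions, map synonymous words to one form, and
    replace words occurring fewer than min_count times with a
    don't-care token."""
    synonyms = synonyms or {}
    words = [synonyms.get(w, w) for w in tokens if w not in STOPWORDS]
    counts = Counter(words)
    return [w if counts[w] >= min_count else "<dontcare>" for w in words]
```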
Averaging Method
 Word code vector
–Each word represented by a unique vector (with dimension n ~ 100)
–Values may be random
 Context Vector
– For the word at position i, with word vector x(i), the
context vector combines the word with its averaged
neighbors (as defined in [1]):
X(i) = [ E{x(i−1) | x(i)} ; ε·x(i) ; E{x(i+1) | x(i)} ]
where:
– E(·) = estimate of the expected value of x over the text corpus
– ε = small scalar number
(contd.)
 Training: present the words,
each with its context vector X(i)
 Input the X(i)’s again.
 At the best-matching node,
write the corresponding
word.
 Words with similar contexts
end up at the same node
Example
Document Category Map
 Encoded by mapping text word by word onto the
WCM.
 A histogram is formed based on the hits on
WCM.
 Use this histogram as the fingerprint for the DCM.
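The three steps above can be sketched as follows; `wcm_node_of`, a lookup from word to its best-matching WCM node, stands in for a trained Word Category Map and is our assumption:

```python
from collections import Counter

def document_fingerprint(doc_words, wcm_node_of):
    """Map a document word by word onto WCM nodes, then build a
    normalized histogram of hits; this histogram is the document's
    fingerprint for the Document Category Map."""
    hits = Counter(wcm_node_of[w] for w in doc_words if w in wcm_node_of)
    total = sum(hits.values()) or 1
    return {node: n / total for node, n in hits.items()}
```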
Summary:
Demo
References
[1] T. Honkela, S. Kaski, K. Lagus, T. Kohonen.
WEBSOM: Self-Organizing Maps of Document Collections. (1997)
[2] T. Honkela, S. Kaski, K. Lagus, T. Kohonen.
Exploration of Full-Text Databases with Self-Organizing Maps. (1996)
[3] T. Kohonen. Self-Organization of Very Large Document
Collections: State of the Art. (1998)
[4] http://websom.hut.fi/websom