GENE EXPRESSION
CLUSTERING
GRAPH BASED APPROACHES
A presentation by Govind M (M120432CS)
MTech Computer Science and Engineering
National Institute of Technology Calicut
govindmaheswaran@gmail.com
Outline
• Clustering and Graph Theory
• Using Graphs in Clustering
• Simple Graph Partitioning
• Spectral Graph Partitioning
• Conclusion
Clustering
• Process of grouping a set of data objects by similarity
• Objects in the same cluster are similar; objects in different clusters are dissimilar.
• Widely used in data mining, market analysis, etc.
• Used to make sense of bioinformatics data.
• Two major purposes in bioinformatics
    • Find properties of genes (relationships among genes, deducing the functions of genes, etc.)
    • Predict more relevant factors (e.g. clustering cancerous and non-cancerous
      genes, finding the effect of a medication)
Graphs
• Data Structure
• Used in multiple domains
• Key Terms
   • Edge
   • Vertex
   • Weighted Graph
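Some Graph Theory
• Cut
   • A set of edges whose removal splits the vertices into two disjoint groups
   • Cost of a cut = sum of the weights of the removed edges
• Partitioning
   • Dividing the vertices of the graph into disjoint groups, usually by choosing low-cost cuts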
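
As a minimal sketch of these terms, assume an undirected weighted graph stored as a dictionary from vertex pairs to weights. The `edges` layout, the toy weights, and the `cut_cost` name are illustrative and not taken from the slides. The cost of a cut is then the total weight of the edges that cross between the two sides of a partition:

```python
def cut_cost(edges, part_a):
    """Sum the weights of edges with exactly one endpoint in part_a."""
    part_a = set(part_a)
    return sum(w for (u, v), w in edges.items()
               if (u in part_a) != (v in part_a))


# Hypothetical 4-vertex weighted graph: (u, v) -> similarity weight.
edges = {(1, 2): 2, (1, 3): 5, (2, 3): 3, (3, 4): 1}

print(cut_cost(edges, {4}))      # only edge 3-4 crosses this cut
print(cut_cost(edges, {1, 2}))   # edges 1-3 and 2-3 cross this cut
```

Separating vertex 4 is the cheapest cut in this toy graph, which is the kind of cut a minimum-cut method would pick first.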
Clustering using Graphs
Involves 3 steps
1.   Preprocessing
     ◦   Convert the data set into a graph
     ◦   Represent it with an adjacency (affinity) matrix and a degree matrix
     ◦   The similarity between two nodes can be taken as the weight of the edge joining them.

2.   Partitioning
     ◦   Partition the graph

3.   Clustering
     ◦   Repeat until the required number of clusters is obtained
     ◦   Alternatively, run extra iterations and then merge (join) clusters.
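
The preprocessing step can be sketched as follows. This is only an illustration under stated assumptions: the slides do not fix a similarity measure, so absolute Pearson correlation between expression profiles is assumed here, and the `threshold` parameter, the toy `profiles`, and both function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def similarity_graph(profiles, threshold=0.0):
    """Build a symmetric affinity matrix; weight = |correlation| when above threshold."""
    n = len(profiles)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            w = abs(pearson(profiles[i], profiles[j]))
            if w > threshold:
                A[i][j] = A[j][i] = w
    return A


# Toy data (assumed): each row is one gene's expression across four samples.
# Genes 0 and 1 rise together; gene 2 follows an unrelated pattern.
profiles = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.5],
    [3.0, 1.0, 4.0, 2.0],
]
A = similarity_graph(profiles, threshold=0.5)
for row in A:
    print([round(w, 3) for w in row])
```

Only the edge between genes 0 and 1 survives the threshold in this toy data. A constant profile has zero variance and would need special handling before computing the correlation.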
Simple Graph Partitioning
• Weight of an edge = Similarity between the nodes
• Find the Minimum Cut
• The lower the weight of an edge, the more likely its end points belong to different clusters
Simple Graph Partitioning : The Algorithm
Input : Graph G<V,E>, Number of Clusters k
Output: k Clusters (subgraphs of G)

Repeat k-1 times
     Low_Val = infinity
     For each edge e of the graph
           Calculate Cut_Cost, cost of a CUT at that edge
           if Cut_Cost < Low_Val
                 Low_Val = Cut_Cost
                 Cut_Edge = e
     Cut at edge Cut_Edge
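
One possible reading of this pseudocode, offered as a sketch rather than the presenter's exact method: treat the cost of a cut at an edge as that edge's weight, and keep removing the weakest remaining edge until the graph falls into k connected components. The function names and the toy graph are hypothetical.

```python
def components(vertices, edges):
    """Return the connected components of an undirected graph as a list of sets."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, comps = set(), []
    for start in vertices:
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node not in comp:
                comp.add(node)
                stack.extend(adj[node] - comp)
        seen |= comp
        comps.append(comp)
    return comps


def simple_partition(vertices, edges, k):
    """Remove lowest-weight edges until k components remain (edges: (u, v) -> weight)."""
    remaining = dict(edges)
    clusters = components(vertices, remaining)
    while len(clusters) < k and remaining:
        weakest = min(remaining, key=remaining.get)  # edge with the lowest similarity
        del remaining[weakest]
        clusters = components(vertices, remaining)
    return clusters


# Toy graph (assumed): two tight groups joined by one weak edge (3, 4).
vertices = [1, 2, 3, 4, 5, 6]
edges = {(1, 2): 9, (2, 3): 8, (1, 3): 7, (3, 4): 1,
         (4, 5): 8, (5, 6): 9, (4, 6): 7}
print(simple_partition(vertices, edges, k=2))
```

Because this reading looks only at the weakest links between groups, asking for a larger k on the same toy graph peels off a single vertex instead of splitting a group evenly.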
Simple Graph Partitioning (contd.)

• Advantage
  • Simple to implement
  • Uses the concept of Min Cut.
• Disadvantage
  • Ignores intra-cluster similarity: only the similarity between clusters is minimized.
Spectral Graph Partitioning
• Widely used
• Uses the eigenvectors of the Laplacian matrix
• Recursive algorithm
• Gives good-quality partitions
• Computationally better than Simple Graph Partitioning.
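Some graph theory…
        • Example : undirected graph on 4 vertices with edges
          1-2 (weight 2), 1-3 (weight 5), 2-3 (weight 3), 3-4 (weight 1)

        • Degree (sum of incident edge weights) :
                                    d1 = 7
                                    d2 = 5
                                    d3 = 9
                                    d4 = 1

        • Affinity Matrix A (symmetric for an undirected graph) :
                                0    2    5    0
                                2    0    3    0
                                5    3    0    1
                                0    0    1    0

        • Degree Matrix D :
                                7    0    0    0
                                0    5    0    0
                                0    0    9    0
                                0    0    0    1

        • Laplacian Matrix L = D - A :
                                7   -2   -5    0
                               -2    5   -3    0
                               -5   -3    9   -1
                                0    0   -1    1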
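
A small sketch of these matrices, assuming the four-vertex example is an undirected graph with edges 1-2, 1-3, 2-3 and 3-4 of weights 2, 5, 3 and 1, and assuming the common convention L = D - A. The function name is illustrative.

```python
def laplacian(A):
    """Return (degrees, L) for a symmetric affinity matrix A, with L = D - A."""
    n = len(A)
    degrees = [sum(row) for row in A]  # weighted degree = row sum
    L = [[(degrees[i] if i == j else 0) - A[i][j] for j in range(n)]
         for i in range(n)]
    return degrees, L


# Symmetric affinity matrix of the assumed undirected example graph.
A = [
    [0, 2, 5, 0],
    [2, 0, 3, 0],
    [5, 3, 0, 1],
    [0, 0, 1, 0],
]
degrees, L = laplacian(A)
print(degrees)
for row in L:
    print(row)
```

Every row of L sums to zero, which is why the constant vector is always an eigenvector of the Laplacian with eigenvalue 0.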
Some more Graph Theory…
• Spectrum : Eigenvalues of the graph, arranged in order of magnitude, together with their eigenvectors.
• Eigenvalues of Graphs
   •   Calculated as the eigenvalues of the Laplacian matrix of the graph
   •   Correspondingly for the eigenvectors


• Fiedler Theorem
   •   Relates eigenvectors to graph properties
   •   Eigenvectors are ordered by eigenvalue : 1st, 2nd, … Kth Principal Eigen Vector.
   •   Principal Eigen Vector (of the affinity matrix) : Centrality of Vertices


• 2nd Principal Eigen Vector : belongs to the second-smallest eigenvalue of the Laplacian (the algebraic connectivity)
   •   Called Fiedler Vector
   •   Vector of positive and negative values
   •   Partition is decided by the sign of each value.
Spectral Graph Partitioning : The Algorithm
Input : Graph G<V,E>
Output: Graphs G1< V1,E1>, G2< V2,E2>

 Create the Laplacian Matrix L of the Graph G.
 Calculate the Fiedler Vector F of L
 for each vertex vi in G
    if F[i] > 0
          V1.append(vi)
    else
          V2.append(vi)
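
The pseudocode above can be sketched in Python. To keep the example dependency-free, the Fiedler vector is approximated with power iteration on cI - L after projecting out the constant eigenvector. This is a swapped-in numerical shortcut, not something the slides prescribe, and a library eigensolver such as `numpy.linalg.eigh` would be the usual choice. The graph is assumed connected and undirected, and the function names, the `iterations` parameter, and the toy matrix are hypothetical.

```python
import math


def fiedler_vector(A, iterations=500):
    """Approximate the Fiedler vector of L = D - A by power iteration."""
    n = len(A)
    deg = [sum(row) for row in A]
    c = 2 * max(deg)  # upper bound on the largest eigenvalue of L
    # Deterministic start vector; it must not be orthogonal to the Fiedler vector.
    x = [float(i) for i in range(n)]
    for _ in range(iterations):
        # Project out the constant eigenvector (eigenvalue 0 of L).
        mean = sum(x) / n
        x = [xi - mean for xi in x]
        # Multiply by (c*I - L) = (c - deg)*I + A; its dominant remaining
        # eigenvector is the Fiedler vector.
        y = [(c - deg[i]) * x[i] + sum(A[i][j] * x[j] for j in range(n))
             for i in range(n)]
        norm = math.sqrt(sum(v * v for v in y))
        x = [v / norm for v in y]
    return x


def spectral_bipartition(A):
    """Split vertex indices by the sign of the Fiedler vector entries."""
    F = fiedler_vector(A)
    V1 = [i for i, f in enumerate(F) if f > 0]
    V2 = [i for i, f in enumerate(F) if f <= 0]
    return V1, V2


# Toy graph (assumed): triangles {0,1,2} and {3,4,5} joined by a weak edge 2-3.
A = [
    [0, 5, 5, 0, 0, 0],
    [5, 0, 5, 0, 0, 0],
    [5, 5, 0, 1, 0, 0],
    [0, 0, 1, 0, 5, 5],
    [0, 0, 0, 5, 0, 5],
    [0, 0, 0, 5, 5, 0],
]
print(spectral_bipartition(A))
```

The sign of an eigenvector is arbitrary, so which side ends up as V1 carries no meaning; only the split itself matters.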
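Spectral Graph Partitioning : Example
  Step 1, whole graph (vertices 1 to 6):
      2nd Principal Vector = <0.415, 0.309, 0.069, −0.221, 0.221, −0.794>
      Positive entries give {1, 2, 3, 5}; negative entries give {4, 6}
  Step 2, subgraph on vertices 1, 2, 3, 5:
      2nd Principal Vector = <0.415, 0.309, −0.190, 0.169>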
Spectral Graph Partitioning : Bipartitioning Method (contd.)

• Recursive algorithm
• Better than Simple Graph Partitioning, but not optimal
• More than two clusters require repeated bipartitioning.

• Can be improved by multipartitioning
• Multipartitioning uses more eigenvectors.
Conclusion
• Clustering can be built on simple concepts of graph theory
• Spectral methods give good, though not guaranteed optimal, results
• Can give better performance than traditional clustering.
• Drawback: preprocessing overhead of building the graph.
Editor's Notes

  • #13: Centrality : Influence