Tweet Along @dgleich


Spectra of Large Networks

David F. Gleich
Sandia National Laboratories

iCME la/opt seminar
24 February 2011

Thanks to Ali Pinar, Jaideep Ray, Tammy Kolda, C. Seshadhri, Rich Lehoucq, and Jure Leskovec for helpful discussions.

Supported by Sandia’s John von Neumann postdoctoral fellowship and the DOE Office of Science’s ASCR Graphs project.

Sandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin
                       Corporation, for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-AC04-94AL85000.
2/24/2011                                                  Stanford ICME seminar                                                           1/36



There’s information inside the spectra




Words in dictionary definitions: 111,000 vertices, 2.7M edges
Internet router network: 192k vertices, 1.2M edges
Also noted in a study of smaller graphs by Banerjee and Jost (2009)



Overview
Graphs and their matrices

Data for our experiments

Computing spectra for large networks

Issues with computing spectra

Many examples of graph spectra

Conclusion and future work

Images taken from Stanford, flickr, and Purdue, respectively




GRAPHS AND THEIR MATRICES
As well as things we already know about graph spectra.







Matrices from graphs
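Adjacency matrix
  $A_{ij} = 1$ if $(i, j)$ is an edge, and $0$ otherwise

Laplacian matrix
  $L = D - A$, where $D$ is the diagonal matrix of degrees

Normalized Laplacian matrix
  $\mathcal{L} = I - D^{-1/2} A D^{-1/2}$

Not covered
Signless Laplacian matrix
Incidence matrix (It is incidentally discussed)
Seidel matrix
Random walk matrix
Modularity matrix
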


                                                                     Everything is undirected.
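
As a minimal sketch of the three matrices studied here, the following builds the adjacency matrix, the Laplacian D - A, and the normalized Laplacian I - D^{-1/2} A D^{-1/2} from an undirected edge list. The function name `graph_matrices` and the dense list-of-lists storage are illustrative assumptions, not anything from the talk.

```python
import math

def graph_matrices(n, edges):
    """Return (A, L, N) for an undirected graph on vertices 0..n-1.

    A is the adjacency matrix, L = D - A the Laplacian, and
    N = I - D^{-1/2} A D^{-1/2} the normalized Laplacian.
    """
    A = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        A[i][j] = 1.0
        A[j][i] = 1.0  # undirected: store both directions
    deg = [sum(row) for row in A]
    L = [[(deg[i] if i == j else 0.0) - A[i][j] for j in range(n)]
         for i in range(n)]
    N = [[(1.0 if i == j else 0.0)
          - (A[i][j] / math.sqrt(deg[i] * deg[j]) if A[i][j] else 0.0)
          for j in range(n)]
         for i in range(n)]
    return A, L, N

# A path on three vertices: 0 - 1 - 2
A, L, N = graph_matrices(3, [(0, 1), (1, 2)])
```

Dense storage is used only because the talk computes full spectra with dense eigensolvers; for anything beyond a toy graph the matrices would be assembled in whatever format the solver expects.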



Why are we interested in the spectra?
Modeling

Properties
  Moments of the adjacency

Anomalies

Regularities

Network Comparison
  Fay et al. 2010 – Weighted Spectral Density
  The network is as19971108 from Jure Leskovec’s SNAP collection (a few thousand nodes), and we insert random connections from 50 nodes



Spectral bounds from Gerschgorin

 

 

 
    (from a slightly different approach)
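
As a hedged illustration of the standard Gerschgorin argument: every eigenvalue of a symmetric matrix lies in some interval centred at a diagonal entry, with radius equal to the sum of the absolute off-diagonal entries in that row. For a graph with maximum degree d_max this places the adjacency eigenvalues in [-d_max, d_max] and the Laplacian eigenvalues in [0, 2 d_max]. The helper name `gerschgorin_interval` and the star-graph example are assumptions made for this sketch.

```python
def gerschgorin_interval(M):
    """Smallest interval containing every Gerschgorin disc of a real symmetric matrix."""
    n = len(M)
    radii = [sum(abs(M[i][j]) for j in range(n) if j != i) for i in range(n)]
    lo = min(M[i][i] - radii[i] for i in range(n))
    hi = max(M[i][i] + radii[i] for i in range(n))
    return lo, hi

# Star graph: vertex 0 is joined to vertices 1, 2, 3, so d_max = 3.
n = 4
A = [[0.0] * n for _ in range(n)]
for leaf in (1, 2, 3):
    A[0][leaf] = A[leaf][0] = 1.0
deg = [sum(row) for row in A]
L = [[(deg[i] if i == j else 0.0) - A[i][j] for j in range(n)] for i in range(n)]

adjacency_bounds = gerschgorin_interval(A)   # (-d_max, d_max)
laplacian_bounds = gerschgorin_interval(L)   # (0, 2 * d_max)
```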







Semi-circle law
Wigner’s semi-circle law
 Eigenvalues of a random symmetric matrix where
 each entry is independent have a semi-circle density




Erdős–Rényi graphs obey a special case
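
For reference, a sketch of the density in its usual textbook form, assuming a symmetric n-by-n matrix whose entries are independent with mean 0 and variance \sigma^2. The symbols \rho, \sigma, and p are introduced here for illustration and do not come from the slide.

```latex
% Wigner semi-circle density for large n
\rho(\lambda) = \frac{1}{2\pi\sigma^{2} n}\,\sqrt{4\sigma^{2} n - \lambda^{2}},
\qquad |\lambda| \le 2\sigma\sqrt{n}.

% Erdos-Renyi G(n,p): adjacency entries have variance \sigma^2 = p(1-p),
% so the bulk of the spectrum fills
|\lambda| \le 2\sqrt{n\,p\,(1-p)},
% and the nonzero mean p of the entries adds one outlying eigenvalue
\lambda_{\max} \approx np.
```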
                                                  Image from Mathworld



Erdős–Rényi Semi-circles
The eigenvalues of the adjacency matrix for
  n=1000, averaged over 10 trials
Semi-circle with outlier if average degree is large enough.
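
A small sketch of the outlier, assuming G(n, p) with n = 300 and p = 0.1 (much smaller than the n = 1000 used for the figure) and plain power iteration in place of a full eigensolver. The bulk edge 2 sqrt(n p (1 - p)) is the standard semi-circle radius; the function names are illustrative.

```python
import math
import random

def erdos_renyi(n, p, rng):
    """Adjacency lists of a G(n, p) graph."""
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i):
            if rng.random() < p:
                adj[i].append(j)
                adj[j].append(i)
    return adj

def largest_eigenvalue(adj, iters=200):
    """Power iteration on the adjacency matrix; returns the Rayleigh quotient."""
    n = len(adj)
    x = [1.0 / math.sqrt(n)] * n
    lam = 0.0
    for _ in range(iters):
        y = [sum(x[j] for j in adj[i]) for i in range(n)]
        lam = sum(xi * yi for xi, yi in zip(x, y))  # x has unit norm, so this is x^T A x
        norm = math.sqrt(sum(v * v for v in y))
        x = [v / norm for v in y]
    return lam

n, p = 300, 0.1
adj = erdos_renyi(n, p, random.Random(0))
outlier = largest_eigenvalue(adj)                 # close to the average degree n * p
bulk_edge = 2.0 * math.sqrt(n * p * (1.0 - p))    # edge of the semi-circle
```

With these parameters the average degree is about 30 while the semi-circle ends near 10.4, so the largest eigenvalue sits well outside the bulk.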




                   Observed by Farkas et al. and in the book “Network Analysis” edited by Brandes and Erlebach (Chapter 14)



Previous results
Farkas et al.: Significant deviation from the
  semi-circle law for the adjacency matrix

Mihail and Papadimitriou: Leading eigenvalues
  of the adjacency matrix obey a power-law
  based on the degree-sequence

Chung et al.: Normalized Laplacian still obeys
  a semi-circle law

Banerjee and Jost: Study of types of patterns
  that emerge in evolving graph models –
  explain many features of the spectra



In comparison …
We use “exact” computation of spectra,
 instead of approximation.

We study “all” of the standard matrices
 over a range of large networks.

Our “large” is bigger.

We look at a few different random graph
 models.







            DATA







Data sources
 Source             Type                          Size (vertices), instances
 SNAP               Various                       100s-100,000s
 SNAP-p2p           Gnutella Network              5-60k, ~30 inst.
 SNAP-as-733        Autonomous Sys.               ~5,000, 733 inst.
 SNAP-caida         Router networks               ~20,000, ~125 inst.
 Pajek              Various                       100s-100,000s
 Models             Copying Model                 1k-100k, 9 inst., 324 graphs
                    Pref. Attach                  1k-100k, 9 inst., 164 graphs
                    Forest Fire                   1k-100k, 9 inst., 324 graphs
 Mine               Various                       2k-500k
 Newman             Various
 Arenas             Various
 Porter             Facebook                      100 schools, 5k-60k
 IsoRank, Natalie   Protein-Protein               <10k, 4 graphs
                                                     Thanks to all who make data available



Big graphs
Graph             Vertices            Edges        Type
Arxiv             86376               1035126      Co-authorship
Dblp              9356                356290       Co-authorship
Dictionary(*)     111982              2750576      Word definitions
Internet(*)       124651              414428       Routers
Itdk0304          190914              1215220      Routers
p2p-Gnutella(*)   62561               295756       Peer-to-peer
Patents(*)        230686              1109898      Citations
Roads             126146              323900       Roads
Wordnet(*)        75606               240036       Word relationship
Web-NotreDame     325729              2994268      Web







Models
Preferential Attachment
Start graph with a k-node clique. Add a new node and
  connect to k random nodes, chosen proportional to degree.

Copying model
Start graph with a k-node clique. Add a new node and pick a
  parent uniformly at random. Copy edges of parent and
  make an error with probability  

Forest Fire
Start graph with a k-node clique. Add a new node and pick a
  parent uniformly at random. Do a random “BFS”/“forest fire”
  and link to all nodes “burned”
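
The preferential attachment description can be sketched as below. This is an illustrative generator, not the code used for the experiments: it assumes k >= 2 and that the k targets of each new node are distinct, neither of which the slide states. The name `preferential_attachment` is hypothetical.

```python
import random

def preferential_attachment(n, k, seed=0):
    """Edge list of an n-node graph grown from a k-node clique.

    Each new node links to k distinct existing nodes chosen with
    probability proportional to their current degree.
    """
    if k < 2 or n < k:
        raise ValueError("need n >= k >= 2")
    rng = random.Random(seed)
    edges = [(i, j) for i in range(k) for j in range(i)]  # the seed clique
    # Every vertex appears once per incident edge, so a uniform draw
    # from this list is a degree-proportional draw from the vertices.
    endpoints = [v for e in edges for v in e]
    for new in range(k, n):
        targets = set()
        while len(targets) < k:
            targets.add(rng.choice(endpoints))
        for t in targets:
            edges.append((new, t))
            endpoints.extend((new, t))
    return edges

edges = preferential_attachment(50, 3)
```

The copying and forest fire models follow the same grow-one-node-at-a-time pattern, with the target set chosen from a random parent's neighbourhood instead of by degree.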




COMPUTING SPECTRA OF LARGE NETWORKS







Matlab!
Always a great starting point.
  My desktop has 24GB of RAM (less than $2500 now!)

24GB/8 bytes (per double) = 3 billion numbers
  ~ 50,000-by-50,000 matrix

Possibilities
  D = eig(A) – needs twice the memory for A,D
  [V,D] = eig(A) – needs three times the memory for A,D,V

These limit us to ~38000 and ~31000 respectively.
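
The limits above follow from a one-line calculation, sketched here under the slide's accounting: dense n-by-n arrays of 8-byte doubles, and 1 GB taken as 10^9 bytes (which is what "3 billion numbers" implies). The helper name `max_dense_order` is an assumption.

```python
import math

def max_dense_order(ram_bytes, copies, bytes_per_entry=8):
    """Largest n such that `copies` dense n-by-n matrices fit in ram_bytes."""
    return math.isqrt(ram_bytes // (copies * bytes_per_entry))

ram = 24 * 10**9                      # 24 GB desktop
n_store = max_dense_order(ram, 1)     # just holding A
n_eig = max_dense_order(ram, 2)       # D = eig(A): A and D
n_eigvec = max_dense_order(ram, 3)    # [V,D] = eig(A): A, D and V
```

These come out a little above 54,000, 38,000 and 31,000, matching the rounded figures on the slide.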





ScaLAPACK
LAPACK with distributed memory dense matrices
ScaLAPACK uses a 2D block-cyclic dense matrix distribution
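
A minimal sketch of the index arithmetic behind a 2D block-cyclic layout, assuming 0-based indices, square blocks, and the first block placed on process (0, 0). The function name `block_cyclic_owner` is illustrative and is not a ScaLAPACK routine.

```python
def block_cyclic_owner(i, j, block, prow, pcol):
    """Map global entry (i, j) to ((process row, process col), (local i, local j)).

    Blocks of size block-by-block are dealt out cyclically over a
    prow-by-pcol process grid.
    """
    def split(g, nprocs):
        blk = g // block                       # which block the index falls in
        owner = blk % nprocs                   # blocks cycle over the processes
        local = (blk // nprocs) * block + g % block
        return owner, local

    pr, li = split(i, prow)
    pc, lj = split(j, pcol)
    return (pr, pc), (li, lj)

# An 8-by-8 matrix in 2-by-2 blocks on a 2-by-2 process grid.
counts = {}
for i in range(8):
    for j in range(8):
        proc, _ = block_cyclic_owner(i, j, 2, 2, 2)
        counts[proc] = counts.get(proc, 0) + 1
```

Dealing blocks out cyclically, rather than giving each process one contiguous slab, keeps every process busy as a factorization works its way down the matrix.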







Eigenvalues with ScaLAPACK
Mostly the same approach as in LAPACK

1. Reduce to tridiagonal form
   (most time consuming part)
2. Distribute tridiagonals to
    all processors
3. Each processor finds
   all eigenvalues
4. Each processor computes a
   subset of eigenvectors

Using the MRRR algorithm, steps 3 and 4 are more intricate and faster.
MRRR is due to Parlett and Dhillon; implemented in ScaLAPACK by Christof Vömel.
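
To make step 3 concrete, here is a sketch of finding the eigenvalues of a symmetric tridiagonal matrix by bisection with Sturm counts. This is the simpler classical method, swapped in for illustration; it is not MRRR and not what ScaLAPACK runs. The names `count_below` and `kth_eigenvalue` are assumptions.

```python
import math

def count_below(diag, off, x):
    """Number of eigenvalues of the symmetric tridiagonal matrix that are < x.

    Counts negative pivots in the LDL^T factorization of T - x I.
    """
    count = 0
    q = 1.0
    for i, a in enumerate(diag):
        b2 = off[i - 1] ** 2 if i > 0 else 0.0
        q = a - x - b2 / q
        if q == 0.0:
            q = 1e-30  # nudge an exactly zero pivot so the next division is defined
        if q < 0.0:
            count += 1
    return count

def kth_eigenvalue(diag, off, k, iters=100):
    """k-th smallest eigenvalue (k = 0, 1, ...) by bisection."""
    n = len(diag)
    radius = [(abs(off[i - 1]) if i > 0 else 0.0)
              + (abs(off[i]) if i < n - 1 else 0.0) for i in range(n)]
    lo = min(a - r for a, r in zip(diag, radius))  # Gerschgorin bounds
    hi = max(a + r for a, r in zip(diag, radius))
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if count_below(diag, off, mid) > k:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# Tridiagonal (-1, 2, -1) of order 5; eigenvalues are 2 - 2 cos(j pi / 6).
n = 5
diag = [2.0] * n
off = [-1.0] * (n - 1)
eigs = [kth_eigenvalue(diag, off, k) for k in range(n)]
```

Each eigenvalue is found independently of the others, which is why this stage parallelizes easily once every processor holds the tridiagonal matrix.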



Alternatives
Use ARPACK to get extrema

Use ARPACK to get interior around $\sigma$ via the folded spectrum $(A - \sigma I)^2$




                                                         Farkas et al. used this approach.
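
A sketch of the folded spectrum idea: the eigenvalues of A nearest a target sigma become the smallest eigenvalues of (A - sigma I)^2, so an extremal-eigenvalue method can reach them. Plain power iteration on c I - (A - sigma I)^2 stands in for ARPACK here, and the function name, the `bound` parameter, and the path-graph example are all assumptions made for illustration.

```python
import math
import random

def matvec(A, x):
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def nearest_eigenvalue(A, sigma, bound, iters=2000, seed=0):
    """Eigenvalue of symmetric A closest to sigma via the folded spectrum.

    `bound` must satisfy |lambda - sigma| <= bound for every eigenvalue.
    """
    rng = random.Random(seed)
    x = [rng.uniform(-1.0, 1.0) for _ in range(len(A))]
    c = bound * bound
    for _ in range(iters):
        y = [a - sigma * v for a, v in zip(matvec(A, x), x)]   # (A - sigma I) x
        z = [a - sigma * v for a, v in zip(matvec(A, y), y)]   # (A - sigma I)^2 x
        x = [c * v - w for v, w in zip(x, z)]                  # power step on c I - folded
        norm = math.sqrt(sum(v * v for v in x))
        x = [v / norm for v in x]
    return sum(v * w for v, w in zip(x, matvec(A, x)))         # Rayleigh quotient in A

# Path graph on 6 vertices; adjacency eigenvalues are 2 cos(j pi / 7).
n = 6
A = [[1.0 if abs(i - j) == 1 else 0.0 for j in range(n)] for i in range(n)]
# Gerschgorin gives |lambda| <= 2, so |lambda - 0.5| <= 2.5.
lam = nearest_eigenvalue(A, sigma=0.5, bound=2.5)
```

Squaring also squares the spacing between eigenvalues near sigma, so convergence slows when the interior spectrum is dense.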




ISSUES WITH COMPUTING SPECTRA







Bugs - Matlab


eig(A)
Returns incorrect eigenvectors




                                                 Seems to be the result of a bug in Intel’s MKL library.



Bug – ScaLAPACK default

sudo apt-get install scalapack-openmpi
Allocate 36000x36000 local matrix
Run on 4 processors

Code crashes







Bug - LAPACK

ScaLAPACK MRRR
Compare standard LAPACK/BLAS to ATLAS performance
Result: correct output from ATLAS
Result: incorrect output from LAPACK
Hypothesis: LAPACK contains a known bug that’s apparently in
  the default Ubuntu LAPACK







Moral


Always test your software.
Extensively.
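
One cheap test that would expose incorrect eigenvectors is to check the residual of each returned eigenpair. The sketch below assumes dense list-of-lists storage, and the helper name `residual` is illustrative; it is not how the bugs in the talk were actually found.

```python
import math

def residual(A, lam, v):
    """Relative residual ||A v - lam v|| / ||v|| of a claimed eigenpair."""
    Av = [sum(a * x for a, x in zip(row, v)) for row in A]
    num = math.sqrt(sum((p - lam * x) ** 2 for p, x in zip(Av, v)))
    return num / math.sqrt(sum(x * x for x in v))

A = [[2.0, 1.0], [1.0, 2.0]]
good = residual(A, 3.0, [1.0, 1.0])   # a true eigenpair of A
bad = residual(A, 3.0, [1.0, 0.0])    # right eigenvalue, wrong vector
```

A residual far above rounding level, relative to the size of the entries of A, signals a wrong pair regardless of which library produced it.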







(Super)-Computers
Redsky
  2x Intel Nehalem 2.9 GHz / 8 cores
  12 GB/node
  22 TB memory total
  used up to 500 nodes / 6 TB memory

Hopper (I)
  2x AMD quad-core 2.4 GHz / 8 cores
  16 GB/node

Cielo (testbed) 20 nodes



Adding MPI tasks vs. using threads
Most math libraries have threaded versions
   (Intel MKL, AMD ACML)
Is it better to use threads or MPI tasks?

It depends.



Threads   Ranks   Time-T   Time-E
1         36      1271.4   339.0
4         16      1058.1   456.6

Threads   Ranks   Time
1         64      1412.5
4         16      1881.4
16        4       Omitted.

                                  Normalized Laplacian for 36k-by-36k co-author graph of CondMat



Strong Parallel Scaling
[Plot: time in hours]

Good strong scaling up to 325,000 vertices

Estimated time for a 500,000-vertex graph: 9 hours with 925 compute nodes (7400 procs)
                  




            EXAMPLES







A $40,000 matrix computation







Spikes?







Nullspaces of the adjacency matrix
 

So eigenvalue 1 of the normalized Laplacian corresponds to the null space of the
  adjacency matrix: if x is such an eigenvector, then D^{-1/2} x is a null-vector of A.
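
A small check of this correspondence, assuming the normalized Laplacian I - D^{-1/2} A D^{-1/2} and a star graph chosen for illustration: two leaves with opposite signs cancel at the centre, giving a null-vector y of A, and x = D^{1/2} y is then an eigenvector of the normalized Laplacian with eigenvalue 1.

```python
import math

# Star graph: vertex 0 is the centre, vertices 1..3 are leaves.
n = 4
A = [[0.0] * n for _ in range(n)]
for leaf in (1, 2, 3):
    A[0][leaf] = A[leaf][0] = 1.0
deg = [sum(row) for row in A]

def normalized_laplacian_apply(x):
    """Compute (I - D^{-1/2} A D^{-1/2}) x."""
    scaled = [xi / math.sqrt(d) for xi, d in zip(x, deg)]
    Ax = [sum(a * s for a, s in zip(row, scaled)) for row in A]
    return [xi - axi / math.sqrt(d) for xi, axi, d in zip(x, Ax, deg)]

y = [0.0, 1.0, -1.0, 0.0]                          # A y = 0
Ay = [sum(a * v for a, v in zip(row, y)) for row in A]
x = [math.sqrt(d) * yi for d, yi in zip(deg, y)]   # x = D^{1/2} y
Lx = normalized_laplacian_apply(x)                 # equals 1 * x
```

Pairs of vertices with identical neighbourhoods always produce such vectors, so graphs with many duplicated neighbourhoods show a large multiplicity at eigenvalue 1.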







Stanford’s Facebook Network




                                       Data from Mason Porter



Movies of spectra…
As these models evolve, what do the spectra look like?








Code will be available eventually. Image from Good Financial Cents.