Lecture 3:
Mathematics of Networks
CS 765: Complex Networks
Slides are modified from Networks: Theory and Application by Lada Adamic
What are networks?
 Networks are collections of points joined by lines.
“Network” ≡ “Graph”
points      lines             domain
vertices    edges, arcs       math
nodes       links             computer science
sites       bonds             physics
actors      ties, relations   sociology
Network elements: edges
 Directed (also called arcs)
 A -> B (EBA)
 A likes B, A gave a gift to B, A is B’s child
 Undirected
 A <-> B or A – B
 A and B like each other
 A and B are siblings
 A and B are co-authors
 Edge attributes
 weight (e.g. frequency of communication)
 ranking (best friend, second best friend…)
 type (friend, relative, co-worker)
 properties depending on the structure of the rest of the graph: e.g.
betweenness
 Multiedge: multiple edges between the same pair of nodes
 Self-edge: from a node to itself
Directed networks
[Figure: directed network among 26 girls (Ada, Cora, Louise, Jean, Helen, Martha, Alice, Robin, Marion, Maxine, Lena, Hazel, Hilda, Frances, Eva, Ruth, Edna, Adele, Jane, Anna, Mary, Betty, Ella, Ellen, Laura, Irene); each arc is labeled 1 or 2]
 girls’ school dormitory dining-table partners (Moreno, The sociometry reader, 1960)
 first and second choices shown
Edge weights can have positive or negative values
 One gene activates/inhibits another
 One person trusting/distrusting another
 Research challenge:
 How does one ‘propagate’ negative feelings in a social network?
 Is my enemy’s enemy my friend?
[Figure: transcription regulatory network in baker’s yeast]
Adjacency matrices
 Representing edges (who is adjacent to whom) as a
matrix
 Aij = 1 if node i has an edge to node j
= 0 if node i does not have an edge to j
 Aii = 0 unless the network has self-loops
 If self-loop, Aii=1
 Aij = Aji if the network is undirected,
or if i and j share a reciprocated edge
Example (directed network with nodes 1 to 5):

A =
0 0 0 0 0
0 0 1 1 0
0 1 0 1 0
0 0 0 0 1
1 1 0 0 0
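The matrix definition above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `adjacency_matrix` is hypothetical, and the edge pairs are read off the non-zero entries of the example matrix.

```python
# Directed edges of the 5-node example, written as (source, target) pairs.
# They are read off the non-zero entries of the example matrix A.
edges = [(2, 3), (2, 4), (3, 2), (3, 4), (4, 5), (5, 2), (5, 1)]
n = 5


def adjacency_matrix(n, edges):
    """Return an n x n list of lists with A[i-1][j-1] = 1 for each edge i -> j."""
    A = [[0] * n for _ in range(n)]
    for i, j in edges:
        A[i - 1][j - 1] = 1  # nodes are labeled 1..n, Python indexes from 0
    return A


A = adjacency_matrix(n, edges)
for row in A:
    print(" ".join(str(x) for x in row))
```

The printed rows should match the example matrix. For an undirected network one would also set `A[j - 1][i - 1] = 1`, which makes the matrix symmetric.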
Adjacency lists
 Edge list
 2 3
 2 4
 3 2
 3 4
 4 5
 5 2
 5 1
 Adjacency list
 is easier to work with if network is
 large
 sparse
 quickly retrieve all neighbors for a node
 1:
 2: 3 4
 3: 2 4
 4: 5
 5: 1 2
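As a rough sketch of the conversion from an edge list to an adjacency list, the snippet below uses a plain dictionary. The helper name `to_adjacency_list` is an assumption made for illustration; the edge list is the one from the slide.

```python
# Edge list from the slide: one (source, target) pair per directed edge.
edge_list = [(2, 3), (2, 4), (3, 2), (3, 4), (4, 5), (5, 2), (5, 1)]


def to_adjacency_list(nodes, edge_list):
    """Map every node to the sorted list of nodes it points to."""
    adj = {v: [] for v in nodes}  # nodes without out-edges keep an empty list
    for src, dst in edge_list:
        adj[src].append(dst)
    for v in adj:
        adj[v].sort()
    return adj


adj = to_adjacency_list(range(1, 6), edge_list)
for node, neighbors in adj.items():
    print(f"{node}: {' '.join(str(x) for x in neighbors)}")
```

Looking up `adj[v]` returns all out-neighbors of `v` directly, which is why this representation suits large, sparse networks.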
Nodes
 Node network properties
 from immediate connections
 indegree
how many directed edges (arcs) are incident on a node
 outdegree
how many directed edges (arcs) originate at a node
 degree (in or out)
number of edges incident on a node
example: a node with outdegree = 2 and indegree = 3 has degree = 5
HyperGraphs
 Edges join more than two nodes at a time (hyperEdge)
 Affiliation networks
 Examples
 Families
 Subnetworks
Can be transformed to a bipartite network
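One way to picture the transformation of a hypergraph into a bipartite network: every hyperedge becomes a node of a second group, and each membership becomes an ordinary edge. The sketch below assumes two made-up families over nodes A to D; the names `hyperedges` and `hypergraph_to_bipartite` are hypothetical.

```python
# Hypothetical hyperedges: each family joins several people at once.
hyperedges = {"family1": ["A", "B", "C"], "family2": ["C", "D"]}


def hypergraph_to_bipartite(hyperedges):
    """Return sorted (member, hyperedge) pairs, one bipartite edge per membership."""
    return sorted(
        (member, name)
        for name, members in hyperedges.items()
        for member in members
    )


bipartite_edges = hypergraph_to_bipartite(hyperedges)
for member, group in bipartite_edges:
    print(member, "-", group)
```

No edge connects two people or two families directly, so the result satisfies the bipartite condition.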
Bipartite (two-mode) networks
 edges occur only between two groups of nodes, not
within those groups
 for example, we may have individuals and events
 directors and boards of directors
 customers and the items they purchase
 metabolites and the reactions they participate in
in matrix notation
 Bij = 1 if node i from the first group links to node j from the second group
 Bij = 0 otherwise
 B is usually not a square matrix!
 for example: we have n customers and m products

B =
1 0 0 0
1 0 0 0
1 1 0 0
1 1 1 1
0 0 0 1
going from a bipartite to a one-mode graph
 One mode projection
 two nodes from the first group
are connected if they link to the
same node in the second group
 naturally high occurrence of
cliques
 some loss of information
 Can use weighted edges to
preserve group occurrences
 Two-mode network
group 1
group 2
Collapsing to a one-mode network
 i and j are linked if they both link to k
 $P_{ij} = \sum_k B_{ik} B_{jk}$
 $P' = B B^T$
 the transpose of a matrix swaps $B_{xy}$ and $B_{yx}$
 if B is an n×m matrix, $B^T$ is an m×n matrix

B =
1 0 0 0
1 0 0 0
1 1 0 0
1 1 1 1
0 0 0 1

B^T =
1 1 1 1 0
0 0 1 1 0
0 0 0 1 0
0 0 0 1 1
Matrix multiplication
 general formula for matrix multiplication: $Z_{ij} = \sum_k X_{ik} Y_{kj}$
 let Z = P’, X = B, Y = B^T

P’ = B B^T =
1 1 1 1 0
1 1 1 1 0
1 1 2 2 0
1 1 2 4 1
0 0 0 1 1

 example: the entry in row 4, column 3 is row 4 of B (1 1 1 1) times column 3 of B^T (1 1 0 0): $1 \cdot 1 + 1 \cdot 1 + 1 \cdot 0 + 1 \cdot 0 = 2$
Collapsing a two-mode network to a one mode-network
 Assume the nodes in group 1 are people and the nodes
in group 2 are movies
 P’ is symmetric
 The diagonal entries of P’ give the number of movies
each person has seen
 The off-diagonal elements of P’ give the number of
movies that both people have seen
P’ =
1 1 1 1 0
1 1 1 1 0
1 1 2 2 0
1 1 2 4 1
0 0 0 1 1
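The projection P’ = B B^T can be checked with a short pure-Python sketch. The function name `one_mode_projection` is hypothetical; B is the 5×4 example matrix from the slides, read as 5 people (rows) and 4 movies (columns).

```python
# Bipartite matrix B from the slides: 5 rows (people) by 4 columns (movies).
B = [
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 1],
]


def one_mode_projection(B):
    """Compute P = B * B^T; P[i][j] is the dot product of rows i and j of B."""
    n = len(B)
    return [
        [sum(bi * bj for bi, bj in zip(B[i], B[j])) for j in range(n)]
        for i in range(n)
    ]


P = one_mode_projection(B)
for row in P:
    print(" ".join(str(x) for x in row))
```

The diagonal entry `P[3][3]` is 4 because person 4 has seen all four movies, and `P[3][2]` is 2 because persons 4 and 3 share two movies.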
Trees
 Trees are undirected graphs that contain no cycles
 For n nodes, number of edges m = n-1
 Any node can be designated as the root
examples of trees
 In nature
 trees
 river networks
 arteries (or veins, but not both)
 Man made
 sewer system
 Computer science
 binary search trees
 decision trees (AI)
 Network analysis
 minimum spanning trees
 from one node – how to reach all other nodes most quickly
 may not be unique, because shortest paths are not always unique
 depends on weight of edges
Planar graphs
 A graph is planar if it can be drawn on a plane without
any edges crossing
Cliques and complete graphs
 Kn is the complete graph (clique) with n vertices
 each vertex is connected to every other vertex
 there are n*(n-1)/2 undirected edges
K5 K8
K3
Kuratowski’s theorem
 Every non-planar network contains at least one
subgraph that is an expansion of K5 or K3,3.
K5 K3,3
Expansion: Addition of new node in the middle of edges.
 Research challenge: Degree of planarity?
#s of planar graphs of different sizes
1:1
2:2
3:4
4:11
Every planar graph
has a straight line
embedding
Edge contractions defined
 A finite graph G is planar if and only if it has no subgraph that is
homeomorphic or edge-contractible to the complete graph on five vertices
(K5) or the complete bipartite graph K3,3. (Kuratowski's Theorem)
Petersen graph
 Example of using edge contractions to show a graph is
not planar
Bi-cliques (cliques in bipartite graphs)
 Km,n is the complete bipartite graph with m and n vertices of the
two different types
 K3,3 maps to the utility graph
 Is there a way to connect three utilities, e.g. gas, water, electricity to
three houses without having any of the pipes cross?
K3,3
Utility graph
Node degree
 Outdegree of node i = $\sum_{j=1}^{n} A_{ij}$
example: the outdegree for node 3 is 2, which we obtain by counting the non-zero entries in the 3rd row: $\sum_{j=1}^{n} A_{3j} = 2$
 Indegree of node j = $\sum_{i=1}^{n} A_{ij}$
example: the indegree for node 3 is 1, which we obtain by counting the non-zero entries in the 3rd column: $\sum_{i=1}^{n} A_{i3} = 1$

A =
0 0 0 0 0
0 0 1 1 0
0 1 0 1 0
0 0 0 0 1
1 1 0 0 0
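The row and column sums can be sketched as follows. The helper names `out_degree` and `in_degree` are assumptions for illustration; A is the example adjacency matrix, with nodes labeled 1 to 5.

```python
# Example adjacency matrix: A[i-1][j-1] = 1 means there is an edge i -> j.
A = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 0, 0, 1],
    [1, 1, 0, 0, 0],
]


def out_degree(A, i):
    """Sum over row i: the number of edges leaving node i."""
    return sum(A[i - 1])


def in_degree(A, j):
    """Sum over column j: the number of edges arriving at node j."""
    return sum(row[j - 1] for row in A)


print("outdegree of node 3:", out_degree(A, 3))
print("indegree of node 3:", in_degree(A, 3))
```

Summed over all nodes, the outdegrees and the indegrees both equal the number of edges, since every edge leaves one node and arrives at one node.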
Degree sequence and Degree distribution
 Degree sequence: An ordered list of the (in,out) degree of each node
 In-degree sequence:
 [2, 2, 2, 1, 1, 1, 1, 0]
 Out-degree sequence:
 [2, 2, 2, 2, 1, 1, 1, 0]
 (undirected) degree sequence:
 [3, 3, 3, 2, 2, 1, 1, 1]
 Degree distribution: A frequency count of the occurrence of each degree
In-degree distribution:
[(2,3) (1,4) (0,1)]
Out-degree distribution:
[(2,4) (1,3) (0,1)]
(undirected) distribution:
[(3,3) (2,2) (1,3)]
[Figure: histogram of the in-degree distribution; x-axis: indegree, y-axis: frequency]
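Turning a degree sequence into a degree distribution is a frequency count. A minimal sketch, assuming the sequences listed on the slide and a hypothetical helper `degree_distribution`:

```python
from collections import Counter

# Degree sequences as listed on the slide.
in_degree_sequence = [2, 2, 2, 1, 1, 1, 1, 0]
out_degree_sequence = [2, 2, 2, 2, 1, 1, 1, 0]
undirected_degree_sequence = [3, 3, 3, 2, 2, 1, 1, 1]


def degree_distribution(sequence):
    """Return (degree, count) pairs, highest degree first."""
    counts = Counter(sequence)
    return sorted(counts.items(), reverse=True)


print(degree_distribution(in_degree_sequence))
print(degree_distribution(out_degree_sequence))
print(degree_distribution(undirected_degree_sequence))
```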
Structural Metrics: Degree distribution
What if it is directed?
Characterizing networks:
How dense are they?
network metrics: graph density
 Of the connections that may exist between n nodes
 directed graph
emax = n*(n-1)
 undirected graph
emax = n*(n-1)/2
 What fraction are present?
 density = e/ emax
 For example, out of 12 possible connections,
this graph has 7, giving it a density of 7/12 = 0.583
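The density formula can be sketched as a small function. The name `density` and its parameters are hypothetical. Reading the slide's example as a directed graph on n = 4 nodes is an assumption, chosen because it gives 4 × 3 = 12 possible connections.

```python
def density(n, e, directed=True):
    """Fraction of possible edges that are present among n nodes."""
    e_max = n * (n - 1) if directed else n * (n - 1) // 2
    return e / e_max


# Assumed reading of the slide's example: 4 nodes, directed, 7 edges present.
print(round(density(4, 7, directed=True), 3))
```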
Graph density
 Would this measure be useful for comparing networks of
different sizes (different numbers of nodes)?
 As n → ∞, a graph whose density reaches
 0 is a sparse graph
 a constant is a dense graph
Characterizing networks:
How far apart are things?
Network metrics: paths
 A path is any sequence of vertices such that every
consecutive pair of vertices in the sequence is
connected by an edge in the network.
 For directed: traversed in the correct direction for the edges.
 path can visit itself (vertex or edge) more than once
 Self-avoiding paths do not intersect themselves.
 Path length r is the number of edges on the path
 Called hops
Network metrics: shortest paths
[Figure: example network on nodes A, B, C, D, E, annotated with path lengths 1, 2, 2, 3, 3]
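In an unweighted network, shortest path lengths (in hops) from one node can be found with breadth-first search. The sketch below uses a made-up undirected graph on nodes A to E. It is not the graph from the slide's figure, whose edges are not recoverable from the text.

```python
from collections import deque


def bfs_distances(adj, source):
    """Return the hop count of a shortest path from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:  # first visit is along a shortest path
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


# Hypothetical undirected graph, stored as an adjacency list.
graph = {
    "A": ["B"],
    "B": ["A", "C", "D"],
    "C": ["B", "E"],
    "D": ["B", "E"],
    "E": ["C", "D"],
}

distances = bfs_distances(graph, "A")
print(distances)
```

Nodes that cannot be reached from the source simply do not appear in the returned dictionary.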
Structural metrics:
Average path length
1 ≤ L ≤ D ≤ N-1
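The inequality can be checked numerically on a small example. This sketch assumes that L is the mean shortest path length over all node pairs, D is the diameter (the longest shortest path), and N is the number of nodes; the slide does not spell these out. It also assumes a connected, undirected graph, and the graph itself is made up.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop counts of shortest paths from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def path_length_stats(adj):
    """Return (average shortest path length, diameter) of a connected undirected graph."""
    lengths = []
    for u in sorted(adj):
        dist = bfs_distances(adj, u)
        lengths.extend(dist[v] for v in adj if v > u)  # count each pair once
    return sum(lengths) / len(lengths), max(lengths)


# Hypothetical connected graph on 5 nodes.
graph = {
    "A": ["B"],
    "B": ["A", "C", "D"],
    "C": ["B", "E"],
    "D": ["B", "E"],
    "E": ["C", "D"],
}

L, D = path_length_stats(graph)
N = len(graph)
print(L, D, N - 1)
```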
Eulerian Path
 Euler’s Seven Bridges of Königsberg
 one of the first problems in graph theory
 Is there a route that crosses each bridge only once and returns to
the starting point?
Source: http://en.wikipedia.org/wiki/Seven_Bridges_of_Königsberg
Image 1 – GNU v1.2: Bogdan, Wikipedia; http://commons.wikimedia.org/wiki/Commons:GNU_Free_Documentation_License
Image 2 – GNU v1.2: Booyabazooka, Wikipedia; http://commons.wikimedia.org/wiki/Commons:GNU_Free_Documentation_License
Image 3 – GNU v1.2: Riojajar, Wikipedia; http://commons.wikimedia.org/wiki/Commons:GNU_Free_Documentation_License
Eulerian and Hamiltonian paths
 Eulerian path: traverse each edge exactly once
 If the starting point and end point are the same: only possible if no node has an odd degree, as the path must enter and leave each node (each shore, in the bridge problem) equally often
 If the path need not return to the starting point: can have 0 or 2 nodes with an odd degree
 Hamiltonian path: visit each vertex exactly once
 A Hamiltonian path is self-avoiding
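The odd-degree test can be applied directly to the Königsberg bridges. In this sketch the land masses are labeled N (north bank), S (south bank), I (island) and E (east island). The labels and function names are hypothetical, and the test assumes the graph is connected.

```python
from collections import Counter

# The seven bridges as a multigraph: one pair per bridge.
bridges = [
    ("N", "I"), ("N", "I"),
    ("S", "I"), ("S", "I"),
    ("N", "E"), ("S", "E"), ("I", "E"),
]


def odd_degree_nodes(edges):
    """Return the sorted list of nodes that touch an odd number of edges."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sorted(node for node, d in degree.items() if d % 2 == 1)


def euler_kind(edges):
    """Classify a connected graph: 'circuit', 'path', or 'none'."""
    odd = len(odd_degree_nodes(edges))
    if odd == 0:
        return "circuit"  # closed walk using every edge exactly once
    if odd == 2:
        return "path"  # open walk between the two odd-degree nodes
    return "none"


print(odd_degree_nodes(bridges), euler_kind(bridges))
```

All four land masses have odd degree, so neither a closed nor an open Eulerian route exists.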
Characterizing networks:
Is everything connected?
Network metrics: components
 If there is a path from every vertex in a network to every
other, the network is connected
 otherwise, it is disconnected
 Component: A subset of vertices such that there exists at least one path from each member of the subset to every other member, and there does not exist another vertex in the network which is connected to any vertex in the subset
 Maximal subset
 A singleton vertex that is not connected to any other forms a size one component
 Every vertex belongs to exactly one component
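Components of an undirected network can be found by repeated graph traversal. A minimal sketch on a made-up graph; the function name `connected_components` and the example adjacency list are assumptions.

```python
def connected_components(adj):
    """Return the components of an undirected graph as sorted lists of nodes."""
    seen = set()
    components = []
    for start in adj:
        if start in seen:
            continue
        component = []
        stack = [start]
        seen.add(start)
        while stack:  # depth-first traversal from start
            u = stack.pop()
            component.append(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        components.append(sorted(component))
    return components


# Hypothetical graph: two larger components and one singleton.
graph = {1: [2, 3], 2: [1], 3: [1], 4: [5], 5: [4], 6: []}

components = connected_components(graph)
largest_fraction = max(len(c) for c in components) / len(graph)
print(components, largest_fraction)
```

Each vertex ends up in exactly one component, and the singleton node 6 forms a component of size one.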
network metrics: size of giant component
 if the largest component encompasses a significant fraction of the graph,
it is called the giant component
components in directed networks
[Figure: directed network on nodes A, B, C, D, E, F, G, H]
 Strongly connected components
 Each node within the component can be reached from every other node in the component by following directed links
 In the example: {B, C, D, E}, {A}, {G, H}, {F}
 Weakly connected components
 every node can be reached from every other node by following links in either direction
 In the example: {A, B, C, D, E}, {G, H, F}
components in directed networks
 Every strongly connected component of more than one
vertex has at least one cycle
 Out-component: set of all vertices that are reachable
via directed paths starting at a specific vertex v
 Out-components of all members of a strongly
connected component are identical
 In-component: set of all vertices from which there is a
directed path to a vertex v
 In-components of all members of a strongly connected
component are identical
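Out-components and in-components can be computed by reachability, and the strongly connected component of v is the intersection of the two. In the sketch below the edges are an assumption, chosen so that the strongly connected components match those listed for the example ({B, C, D, E}, {A}, {G, H}, {F}); the actual edges of the figure are not recoverable. Each component includes v itself, by the convention used here.

```python
def reachable(adj, start):
    """Set of nodes reachable from start by directed paths (including start)."""
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in adj.get(u, []):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def reverse(adj):
    """Return the graph with every edge direction flipped."""
    rev = {u: [] for u in adj}
    for u, neighbors in adj.items():
        for v in neighbors:
            rev.setdefault(v, []).append(u)
    return rev


# Hypothetical directed edges consistent with the listed components.
graph = {
    "A": ["B"], "B": ["C"], "C": ["D"], "D": ["E"], "E": ["B"],
    "F": [], "G": ["H"], "H": ["G", "F"],
}

out_component = reachable(graph, "B")
in_component = reachable(reverse(graph), "B")
scc = out_component & in_component
print(sorted(out_component), sorted(in_component), sorted(scc))
```

Running the same computation from C, D or E gives the same out-component and in-component, as stated for members of one strongly connected component.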
bowtie model of the web
 The Web is a directed graph:
 webpages link to other webpages
 The connected components tell us what set of pages can
be reached from any other just by surfing
 no ‘jumping’ around by typing in a URL or using a search engine
 Broder et al. 1999 – crawl of over 200 million pages and
1.5 billion links.
 SCC – 27.5%
 IN and OUT – 21.5% each
 Tendrils and tubes – 21.5%
 Disconnected – 8%
degree distribution
 indegree: power-law exponent ~ 2.1
 outdegree: power-law exponent ~ 2.4
source: Pennock et al.: Winners don't take all: Characterizing the competition for links on the web
PNAS April 16, 2002 vol. 99 no. 8 5207-5211
clustering & motifs
 clustering coefficient ~ 0.11 (at the site level)
Source: Milo et al., “Superfamilies of evolved and designed networks”, Science 303 (5663), p. 1538-1542, 2004.
shortest paths
 <d> = 0.35 + 2.06 log(N)
 prediction: <d> = 17.5 for 200 million nodes
 actual: <d> = 16 for reachable pairs
[Figure: average shortest path (y-axis) versus number of webpages (x-axis)]
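The prediction can be reproduced with a one-line calculation. Taking the logarithm in base 10 is an assumption: the slide does not state the base, but base 10 reproduces the quoted value of about 17.5 for 200 million nodes. The function name is hypothetical.

```python
import math


def predicted_average_path(n_nodes):
    """Evaluate <d> = 0.35 + 2.06 * log10(N) from the slide (base 10 assumed)."""
    return 0.35 + 2.06 * math.log10(n_nodes)


prediction = predicted_average_path(200_000_000)
print(round(prediction, 2))
```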
Network Analysis
 What is a network?
 a bunch of nodes and edges
 How do you characterize it?
 with some basic network metrics
 How did network analysis get started?
 it was the mathematicians
 How do you analyze networks today?
 with Pajek or other software
overview of network analysis tools
Pajek
network analysis and visualization,
menu driven, suitable for large networks
platforms: Windows (on Linux via Wine)
Netlogo
agent based modeling
recently added network modeling capabilities
platforms: any (Java)
GUESS
network analysis and visualization,
extensible, script-driven (jython)
platforms: any (Java)
Other software tools that we will not be using but that you may find useful:
visualization and analysis:
UCInet - user friendly social network visualization and analysis software (suitable for smaller networks)
iGraph - if you are familiar with R, you can use iGraph as a module to analyze or create large networks, or you can directly use the C functions
Jung - comprehensive Java library of network analysis, creation and visualization routines
Graph package for Matlab (untested?) - if Matlab is the environment you are most comfortable in, here are some basic routines
SIENA - for p* models and longitudinal analysis
SNA package for R - all sorts of analysis + heavy duty stats to boot
NetworkX - python based free package for analysis of large graphs
InfoVis Cyberinfrastructure - large agglomeration of network analysis tools/routines, partly menu driven
visualization only:
GraphViz - open source network visualization software (can handle large/specialized networks)
TouchGraph - need to quickly create an interactive visualization for the web?
yEd - free, graph visualization and editing software
specialized:
fast community finding algorithm
motif profiles
CLAIR library - NLP and IR library (Perl Based) includes network analysis routines
finally: INSNA long list of SNA packages
tools we’ll use
 Pajek: extensive menu-driven functionality, including many,
many network metrics and manipulations
 but… not extensible
 Guess: extensible, scriptable tool of exploratory data analysis,
but more limited selection of built-in methods compared to
Pajek
 NetLogo: general agent based simulation platform with
excellent network modeling support
 many of the demos in this course were built with NetLogo
 iGraph: libraries can be accessed through R or python.
Routines scale to millions of nodes.
other tools: visualization tool: gephi
 http://gephi.org
 primarily for visualization, has some nice touches
http://player.vimeo.com/video/9726202
visualization tool: GraphViz
 Takes descriptions of graphs in simple text languages
 Outputs images in useful formats
 Options for shapes and colors
 Standalone or use as a library
 dot: hierarchical or layered drawings of directed graphs,
by avoiding edge crossings and reducing edge length
 neato (Kamada-Kawai) and fdp (Fruchterman-Reingold
with heuristics to handle larger graphs)
 twopi – radial layout
 circo – circular layout
http://www.graphviz.org
GraphViz: dot language
digraph G {
ranksep=4
nodesep=0.1
size="8,11"
ARCH531_20061 [label="ARCH531",style=bold,color=yellow,style=filled]
ARCH531_20071 [label="ARCH531",style=bold,color=yellow,style=filled]
BIT512_20071 [label="BIT512",style=bold,color=yellow,style=filled]
BIT513_20071 [label="BIT513",style=bold,color=yellow,style=filled]
BIT646_20064 [label="BIT646",style=bold,color=yellow,style=filled]
BIT648_20064 [label="BIT648",style=bold,color=yellow,style=filled]
DESCI502_20071 [label="DESCI502",style=bold,color=yellow,style=filled]
ECON500_20064 [label="ECON500",style=bold,color=yellow,style=filled]
…
…
SI791_20064->SI549_20064[weight=2,color=slategray,style="setlinewidth(4)"]
SI791_20064->SI596_20071[weight=5,color=slategray,style=bold,style="setlinewidth(10)"]
SI791_20064->SI616_20071[weight=2,color=slategray,style=bold,style="setlinewidth(4)"]
SI791_20064->SI702_20071[weight=2,color=slategray,style=bold,style="setlinewidth(4)"]
SI791_20064->SI719_20071[weight=2,color=slategray,style=bold,style="setlinewidth(4)"]
}
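A dot file like the one above is usually generated by a script rather than written by hand. The following is a rough sketch of such a generator, not the code used for the course recommender. The function name `to_dot` is hypothetical, the node labels are guessed from the ID pattern, and the rule "line width = 2 × weight" is inferred from the edge lines shown (weight 2 gives width 4, weight 5 gives width 10).

```python
def to_dot(nodes, edges):
    """Build DOT text for a directed graph with labeled nodes and weighted edges."""
    lines = ["digraph G {"]
    for node_id, label in nodes:
        lines.append(f'  {node_id} [label="{label}",color=yellow,style=filled]')
    for src, dst, weight in edges:
        width = 2 * weight  # assumed rule, inferred from the example edges
        lines.append(
            f"  {src} -> {dst} "
            f'[weight={weight},color=slategray,style="setlinewidth({width})"]'
        )
    lines.append("}")
    return "\n".join(lines)


# Node IDs taken from the example; labels are assumed from the ID pattern.
nodes = [
    ("SI791_20064", "SI791"),
    ("SI549_20064", "SI549"),
    ("SI596_20071", "SI596"),
]
edges = [
    ("SI791_20064", "SI549_20064", 2),
    ("SI791_20064", "SI596_20071", 5),
]

dot_text = to_dot(nodes, edges)
print(dot_text)
```

The resulting text could be saved to a file and rendered with one of the GraphViz layout programs named above, such as dot or neato.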
Dot (GraphViz)
Lada’s school of information course recommender (GraphViz)
[Figure: course-recommender graph; nodes are course codes such as ARCH531, BIT545, SI512 and SI702]
Lada’s school of information course recommender (GraphViz)
Neato (Graphviz)
Other visualization tools: Walrus
 developed at CAIDA available under the GNU GPL.
 “…best suited to visualizing moderately sized graphs that are
nearly trees. A graph with a few hundred thousand nodes and
only a slightly greater number of links is likely to be
comfortable to work with.”
 Java-based
 Implemented Features
 rendering at a guaranteed frame rate regardless of graph size
 coloring nodes and links with a fixed color, or by RGB values
stored in attributes
 labeling nodes
 picking nodes to examine attribute values
 displaying a subset of nodes or links based on a user-supplied
boolean attribute
 interactive pruning of the graph to temporarily reduce clutter and
occlusion
 zooming in and out
Source: CAIDA, http://www.caida.org/tools/visualization/walrus/
visualization tools: yEd - Java Graph Editor
http://www.yworks.com/en/products_yed_about.htm
(good primarily for layouts)
yEd and 26,000 nodes
(takes a few seconds)
visualization tools: Prefuse
 user interface toolkit for interactive information visualization
 built in Java using Java2D graphics library
 data structures and algorithms
 pipeline architecture featuring reusable, composable modules
 animation and rendering support
 architectural techniques for scalability
 requires knowledge of Java programming
 website: http://prefuse.sourceforge.net
Simple prefuse visualizations
Source: Prefuse, https://github.com/prefuse
Examples of prefuse applications: flow maps
A flow map of migration from California from
1995-2000, generated automatically by our
system using edge routing but no layout
adjustment.
 http://graphics.stanford.edu/papers/flow_map_layout/
Examples of prefuse applications: vizster
 http://jheer.org/vizster
Outline
 Network metrics can help us characterize networks
 This has its roots in graph theory
 Today there are many network analysis tools to choose
from
 though most of them are in beta!

mathematics of network science: basic definitions

  • 1. Lecture 3: Mathematics of Networks CS 765: Complex Networks Slides are modified from Networks: Theory and Application by Lada Adamic
  • 2. What are networks?  Networks are collections of points joined by lines. “Network” ≡ “Graph” points lines Domain vertices edges, arcs math nodes links computer science sites bonds physics actors ties, relations sociology node edge 2
  • 3. Network elements: edges  Directed (also called arcs)  A -> B (EBA)  A likes B, A gave a gift to B, A is B’s child  Undirected  A <-> B or A – B  A and B like each other  A and B are siblings  A and B are co-authors  Edge attributes  weight (e.g. frequency of communication)  ranking (best friend, second best friend…)  type (friend, relative, co-worker)  properties depending on the structure of the rest of the graph: e.g. betweenness  Multiedge: multiple edges between two pair of nodes  Self-edge: from a node to itself 3
  • 4. Directed networks 2 1 1 2 1 2 1 2 1 2 2 1 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 Ada Cora Louise Jean Helen Martha Alice Robin Marion Maxine Lena Hazel Hilda Frances Eva Ruth Edna Adele Jane Anna Mary Betty Ella Ellen Laura Irene  girls’ school dormitory dining-table partners (Moreno, The sociometry reader, 1960)  first and second choices shown 4
  • 5. Edge weights can have positive or negative values  One gene activates/ inhibits another  One person trusting/ distrusting another  Research challenge:  How does one ‘propagate’ negative feelings in a social network?  Is my enemy’s enemy my friend? Transcription regulatory network in baker’s yeast 5
  • 6. Adjacency matrices  Representing edges (who is adjacent to whom) as a matrix  Aij = 1 if node i has an edge to node j = 0 if node i does not have an edge to j  Aii = 0 unless the network has self-loops  If self-loop, Aii=1  Aij = Aji if the network is undirected, or if i and j share a reciprocated edge i j i i j 1 2 3 4 Example: 5 0 0 0 0 0 0 0 1 1 0 0 1 0 1 0 0 0 0 0 1 1 1 0 0 0 A = 6
  • 7. Adjacency lists  Edge list  2 3  2 4  3 2  3 4  4 5  5 2  5 1  Adjacency list  is easier to work with if network is  large  sparse  quickly retrieve all neighbors for a node  1:  2: 3 4  3: 2 4  4: 5  5: 1 2 1 2 3 4 5 7
  • 8. Nodes  Node network properties  from immediate connections  indegree how many directed edges (arcs) are incident on a node  outdegree how many directed edges (arcs) originate at a node  degree (in or out) number of edges incident on a node outdegree=2 indegree=3 degree=5 8
  • 9. HyperGraphs  Edges join more than two nodes at a time (hyperEdge)  Affliation networks  Examples  Families  Subnetworks Can be transformed to a bipartite network 9 C D A B C D A B
  • 10. Bipartite (two-mode) networks  edges occur only between two groups of nodes, not within those groups  for example, we may have individuals and events  directors and boards of directors  customers and the items they purchase  metabolites and the reactions they participate in
  • 11. in matrix notation  Bij  = 1 if node i from the first group links to node j from the second group  = 0 otherwise  B is usually not a square matrix!  for example: we have n customers and m products i j 1 0 0 0 1 0 0 0 1 1 0 0 1 1 1 1 0 0 0 1 B =
  • 12. going from a bipartite to a one-mode graph  One mode projection  two nodes from the first group are connected if they link to the same node in the second group  naturally high occurrence of cliques  some loss of information  Can use weighted edges to preserve group occurrences  Two-mode network group 1 group 2
  • 13. Collapsing to a one-mode network  i and j are linked if they both link to k  Pij = k Bik Bjk  P’ = B BT  the transpose of a matrix swaps Bxy and Byx  if B is an nxm matrix, BT is an mxn matrix i k=1 j k=2 B = BT = 1 0 0 0 1 0 0 0 1 1 0 0 1 1 1 1 0 0 0 1 1 1 1 1 0 0 0 1 1 0 0 0 0 1 0 0 0 0 1 1
  • 14. Matrix multiplication  general formula for matrix multiplication Zij= k Xik Ykj  let Z = P’, X = B, Y = BT 1 0 0 0 1 0 0 0 1 1 0 0 1 1 1 1 0 0 0 1 P’ = 1 1 1 1 0 0 0 1 1 0 0 0 0 1 0 0 0 0 1 1 = 1 1 1 1 0 1 1 1 1 0 1 1 2 2 0 1 1 2 4 1 0 0 0 1 1 1 1 1 2 1 1 1 1 1 1 1 0 0 = 1*1+1*1 + 1*0 + 1*0 = 2
  • 15. Collapsing a two-mode network to a one mode-network  Assume the nodes in group 1 are people and the nodes in group 2 are movies  P’ is symmetric  The diagonal entries of P’ give the number of movies each person has seen  The off-diagonal elements of P’ give the number of movies that both people have seen P’ = 1 1 1 1 0 1 1 1 1 0 1 1 2 2 0 1 1 2 4 1 0 0 0 1 1 1 1 1 2 1
  • 16. Trees  Trees are undirected graphs that contain no cycles  For n nodes, number of edges m = n-1  Any node can be dedicated as the root
  • 17. examples of trees  In nature  trees  river networks  arteries (or veins, but not both)  Man made  sewer system  Computer science  binary search trees  decision trees (AI)  Network analysis  minimum spanning trees  from one node – how to reach all other nodes most quickly  may not be unique, because shortest paths are not always unique  depends on weight of edges
  • 18. Planar graphs  A graph is planar if it can be drawn on a plane without any edges crossing
  • 19. Cliques and complete graphs  Kn is the complete graph (clique) with K vertices  each vertex is connected to every other vertex  there are n*(n-1)/2 undirected edges K5 K8 K3
  • 20. Kuratowski’s theorem  Every non-planar network contains at least one subgraph that is an expansion of K5 or K3,3. K5 K3,3 Expansion: Addition of new node in the middle of edges.  Research challenge: Degree of planarity? 20
  • 21. #s of planar graphs of different sizes 1:1 2:2 3:4 4:11 Every planar graph has a straight line embedding
  • 22. Edge contractions defined  A finite graph G is planar if and only if it has no subgraph that is homeomorphic or edge-contractible to the complete graph in five vertices (K5) or the complete bipartite graph K3, 3. (Kuratowski's Theorem)
  • 23. Peterson graph  Example of using edge contractions to show a graph is not planar
  • 24. Bi-cliques (cliques in bipartite graphs)  Km,n is the complete bipartite graph with m and n vertices of the two different types  K3,3 maps to the utility graph  Is there a way to connect three utilities, e.g. gas, water, electricity to three houses without having any of the pipes cross? K3,3 Utility graph
  • 25. Node degree  Outdegree = 0 0 0 0 0 0 0 1 1 0 0 1 0 1 0 0 0 0 0 1 1 1 0 0 0 A =   n j ij A 1 example: outdegree for node 3 is 2, which we obtain by summing the number of non- zero entries in the 3rd row  Indegree = 0 0 0 0 0 0 0 1 1 0 0 1 0 1 0 0 0 0 0 1 1 1 0 0 0 A =   n i ij A 1 example: the indegree for node 3 is 1, which we obtain by summing the number of non-zero entries in the 3rd column   n i i A 1 3   n j j A 1 3 1 2 3 4 5 25
  • 26. Degree sequence and Degree distribution  Degree sequence: An ordered list of the (in,out) degree of each node  In-degree sequence:  [2, 2, 2, 1, 1, 1, 1, 0]  Out-degree sequence:  [2, 2, 2, 2, 1, 1, 1, 0]  (undirected) degree sequence:  [3, 3, 3, 2, 2, 1, 1, 1]  Degree distribution: A frequency count of the occurrence of each degree In-degree distribution: [(2,3) (1,4) (0,1)] Out-degree distribution: [(2,4) (1,3) (0,1)] (undirected) distribution: [(3,3) (2,2) (1,3)] 0 1 2 0 1 2 3 4 5 indegree frequency 26
  • 27. Structural Metrics: Degree distribution 27 What if it is directed ?
  • 29. network metrics: graph density  Of the connections that may exist between n nodes  directed graph emax = n*(n-1)  undirected graph emax = n*(n-1)/2  What fraction are present?  density = e/ emax  For example, out of 12 possible connections, this graph has 7, giving it a density of 7/12 = 0.583 29
  • 30. Graph density  Would this measure be useful for comparing networks of different sizes (different numbers of nodes)?  As n → ∞, a graph whose density approaches  0 is a sparse graph  a nonzero constant is a dense graph
  • 31. Characterizing networks: How far apart are things?
  • 32. Network metrics: paths  A path is any sequence of vertices such that every consecutive pair of vertices in the sequence is connected by an edge in the network.  For directed networks: each edge must be traversed in its correct direction.  A path can intersect itself (visit a vertex or traverse an edge more than once)  Self-avoiding paths do not intersect themselves.  Path length r is the number of edges on the path  Each edge traversed is called a hop
  • 34. Network metrics: shortest paths [Figure: example network on nodes A, B, C, D, E with path lengths marked]
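  • 35. Structural metrics: Average path length  1 ≤ L ≤ D ≤ N-1, where L is the average shortest-path length, D is the diameter (the longest shortest path) and N is the number of nodes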
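In an unweighted network, shortest-path lengths in hops can be found with breadth-first search, and the average path length and diameter follow from them. The sketch below uses a hypothetical 4-node chain A - B - C - D (not the network from the slides) and averages over reachable pairs of distinct nodes, which is one common convention.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop counts from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def path_length_stats(adj):
    """Average shortest-path length L and diameter D over reachable pairs."""
    total, pairs, diameter = 0, 0, 0
    for s in adj:
        for t, d in bfs_distances(adj, s).items():
            if t != s:
                total += d
                pairs += 1
                diameter = max(diameter, d)
    if pairs == 0:
        raise ValueError("no pair of distinct nodes is connected")
    return total / pairs, diameter


# Hypothetical undirected chain: A - B - C - D.
chain = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
L, D = path_length_stats(chain)
N = len(chain)
print(L, D, N - 1)
```

The chain is the extreme case in which the diameter reaches its upper bound of N-1.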
  • 36. Eulerian Path  Euler’s Seven Bridges of Königsberg  one of the first problems in graph theory  Is there a route that crosses each bridge only once and returns to the starting point? Source: http://en.wikipedia.org/wiki/Seven_Bridges_of_Königsberg Image 1 – GNU v1.2: Bogdan, Wikipedia; http://commons.wikimedia.org/wiki/Commons:GNU_Free_Documentation_License Image 2 – GNU v1.2: Booyabazooka, Wikipedia; http://commons.wikimedia.org/wiki/Commons:GNU_Free_Documentation_License Image 3 – GNU v1.2: Riojajar, Wikipedia; http://commons.wikimedia.org/wiki/Commons:GNU_Free_Documentation_License
  • 37. Eulerian and Hamiltonian paths  Eulerian path: traverses each edge exactly once  If the starting point and end point are the same: only possible if no node has an odd degree, since the path must both enter and leave each shore  If the path does not need to return to the starting point: there can be 0 or 2 nodes with an odd degree  Hamiltonian path: visits each vertex exactly once  A Hamiltonian path is self-avoiding
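The odd-degree rule can be applied to the Königsberg bridges themselves. The sketch below assumes the graph is connected and encodes the seven bridges as a multigraph edge list; the labels N, S, I, E (north bank, south bank, island, east land mass) are illustrative names, not taken from the slides.

```python
from collections import Counter


def odd_degree_nodes(edges):
    """Nodes with odd degree in an undirected multigraph given as an edge list."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sorted(node for node, d in degree.items() if d % 2 == 1)


def euler_classification(edges):
    """Classify a connected multigraph by its number of odd-degree nodes."""
    odd = len(odd_degree_nodes(edges))
    if odd == 0:
        return "closed Eulerian path"  # returns to the starting point
    if odd == 2:
        return "open Eulerian path"  # starts and ends at the two odd nodes
    return "no Eulerian path"


# Seven bridges: two N-I, two S-I, and one each for I-E, N-E, S-E.
konigsberg = [
    ("N", "I"), ("N", "I"),
    ("S", "I"), ("S", "I"),
    ("I", "E"), ("N", "E"), ("S", "E"),
]

print(odd_degree_nodes(konigsberg))
print(euler_classification(konigsberg))
```

All four land masses have odd degree, so no route crosses every bridge exactly once.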
  • 39. Network metrics: components  If there is a path from every vertex in a network to every other, the network is connected  otherwise, it is disconnected  Component: A subset of vertices such that there exists at least one path from each member of the subset to every other member, and there is no other vertex in the network that is connected to any vertex in the subset  Maximal subset  A singleton vertex that is not connected to any other forms a component of size one  Every vertex belongs to exactly one component
  • 40. network metrics: size of giant component  if the largest component encompasses a significant fraction of the graph, it is called the giant component
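Components of an undirected network can be found by repeatedly growing a set outward from an unvisited vertex. The sketch below uses a hypothetical 6-node network (a triangle, a pair and a singleton) and reports the fraction of nodes in the largest component; what counts as a "significant fraction" is left to the reader.

```python
def connected_components(adj):
    """Return the components of an undirected graph as a list of vertex sets."""
    seen = set()
    components = []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        component = set()
        while stack:
            u = stack.pop()
            component.add(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        components.append(component)
    return components


# Hypothetical network: triangle 1-2-3, pair 4-5, singleton 6.
network = {1: [2, 3], 2: [1, 3], 3: [1, 2], 4: [5], 5: [4], 6: []}

components = connected_components(network)
largest = max(components, key=len)
largest_fraction = len(largest) / len(network)
print(components, largest_fraction)
```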
  • 41. components in directed networks  Strongly connected components  Each node within the component can be reached from every other node in the component by following directed links  Weakly connected components  Every node can be reached from every other node by following links in either direction [Figure: example directed network on nodes A to H, drawn twice to highlight its strongly and its weakly connected components]
  • 42. components in directed networks  Every strongly connected component of more than one vertex has at least one cycle  Out-component: set of all vertices that are reachable via directed paths starting at a specific vertex v  Out-components of all members of a strongly connected component are identical  In-component: set of all vertices from which there is a directed path to a vertex v  In-components of all members of a strongly connected component are identical [Figure: example directed network on nodes A to H]
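Out-components and in-components can both be computed with a reachability search, run on the graph and on its reverse. The sketch below uses a hypothetical 4-node directed graph and counts v itself as a member of its own components, which is an assumed convention. Under that convention, the strongly connected component of v is the intersection of its out-component and in-component.

```python
def reachable(adj, start):
    """All vertices reachable from start by following directed edges (start included)."""
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in adj.get(u, []):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def reverse(adj):
    """Graph with every edge direction flipped."""
    rev = {u: [] for u in adj}
    for u, neighbours in adj.items():
        for v in neighbours:
            rev.setdefault(v, []).append(u)
    return rev


def out_component(adj, v):
    return reachable(adj, v)


def in_component(adj, v):
    return reachable(reverse(adj), v)


def strongly_connected_component(adj, v):
    return out_component(adj, v) & in_component(adj, v)


# Hypothetical directed graph: A -> B, B -> C, C -> B, C -> D.
g = {"A": ["B"], "B": ["C"], "C": ["B", "D"], "D": []}

print(out_component(g, "B"), in_component(g, "B"))
print(strongly_connected_component(g, "B"))
```

B and C lie on a cycle, so they share one strongly connected component and have identical out-components and in-components.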
  • 43. bowtie model of the web  The Web is a directed graph:  webpages link to other webpages  The connected components tell us what set of pages can be reached from any other just by surfing  no ‘jumping’ around by typing in a URL or using a search engine  Broder et al. 1999 – crawl of over 200 million pages and 1.5 billion links.  SCC – 27.5%  IN and OUT – 21.5% each  Tendrils and tubes – 21.5%  Disconnected – 8%
  • 44. degree distribution  indegree: power-law exponent ~ 2.1  outdegree: power-law exponent ~ 2.4 Source: Pennock et al., “Winners don't take all: Characterizing the competition for links on the web”, PNAS, April 16, 2002, vol. 99, no. 8, p. 5207-5211.
  • 45. clustering & motifs  clustering coefficient ~ 0.11 (at the site level) Source: Milo et al., “Superfamilies of evolved and designed networks”, Science 303 (5663), p. 1538-1542, 2004.
  • 46. shortest paths  <d> = 0.35 + 2.06 log10(N)  prediction: <d> = 17.5 for 200 million nodes  actual: <d> = 16 for reachable pairs [Plot: average shortest path vs. number of webpages]
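The prediction on this slide can be reproduced with a one-line calculation. The sketch below assumes the logarithm is base 10, since that is the base that yields roughly 17.5 for 200 million nodes; the function name is illustrative.

```python
import math


def predicted_average_path(n):
    """Fitted average shortest path <d> = 0.35 + 2.06 * log10(N)."""
    return 0.35 + 2.06 * math.log10(n)


d_200m = predicted_average_path(200_000_000)
print(round(d_200m, 2))
```

The result is about 17.45, which the slide rounds to 17.5. Because the dependence is logarithmic, a tenfold increase in the number of pages adds only 2.06 to the predicted average path.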
  • 47. Network Analysis  What is a network?  a bunch of nodes and edges  How do you characterize it?  with some basic network metrics  How did network analysis get started?  it was the mathematicians  How do you analyze networks today?  with Pajek or other software
  • 48. overview of network analysis tools  Pajek: network analysis and visualization, menu driven, suitable for large networks; platforms: Windows (on Linux via Wine)  NetLogo: agent based modeling, recently added network modeling capabilities; platforms: any (Java)  GUESS: network analysis and visualization, extensible, script-driven (Jython); platforms: any (Java)  Other software tools that we will not be using but that you may find useful:  visualization and analysis:  UCInet - user friendly social network visualization and analysis software (suitable for smaller networks)  iGraph - if you are familiar with R, you can use iGraph as a module to analyze or create large networks, or you can directly use the C functions  Jung - comprehensive Java library of network analysis, creation and visualization routines  Graph package for Matlab (untested?) - if Matlab is the environment you are most comfortable in, here are some basic routines  SIENA - for p* models and longitudinal analysis  SNA package for R - all sorts of analysis + heavy duty stats to boot  NetworkX - Python based free package for analysis of large graphs  InfoVis Cyberinfrastructure - large agglomeration of network analysis tools/routines, partly menu driven  visualization only:  GraphViz - open source network visualization software (can handle large/specialized networks)  TouchGraph - need to quickly create an interactive visualization for the web?  yEd - free graph visualization and editing software  specialized:  fast community finding algorithm  motif profiles  CLAIR library - NLP and IR library (Perl based), includes network analysis routines  finally: INSNA long list of SNA packages
  • 49. tools we’ll use  Pajek: extensive menu-driven functionality, including many, many network metrics and manipulations  but… not extensible  GUESS: extensible, scriptable tool for exploratory data analysis, but a more limited selection of built-in methods compared to Pajek  NetLogo: general agent based simulation platform with excellent network modeling support  many of the demos in this course were built with NetLogo  iGraph: libraries can be accessed through R or Python. Routines scale to millions of nodes.
  • 50. other tools: visualization tool: Gephi  http://gephi.org  primarily for visualization, has some nice touches http://player.vimeo.com/video/9726202
  • 51. visualization tool: GraphViz  Takes descriptions of graphs in simple text languages  Outputs images in useful formats  Options for shapes and colors  Standalone or use as a library  dot: hierarchical or layered drawings of directed graphs, avoiding edge crossings and reducing edge length  neato (Kamada-Kawai) and fdp (Fruchterman-Reingold with heuristics to handle larger graphs)  twopi – radial layout  circo – circular layout http://www.graphviz.org
  • 52. GraphViz: dot language
      digraph G {
        // global layout settings
        ranksep=4
        nodesep=0.1
        size="8,11"
        // course nodes: one node per course offering, filled yellow
        ARCH531_20061 [label="ARCH531",style=bold,color=yellow,style=filled]
        ARCH531_20071 [label="ARCH531",style=bold,color=yellow,style=filled]
        BIT512_20071 [label="BIT512",style=bold,color=yellow,style=filled]
        BIT513_20071 [label="BIT513",style=bold,color=yellow,style=filled]
        BIT646_20064 [label="BIT646",style=bold,color=yellow,style=filled]
        BIT648_20064 [label="BIT648",style=bold,color=yellow,style=filled]
        DESCI502_20071 [label="DESCI502",style=bold,color=yellow,style=filled]
        ECON500_20064 [label="ECON500",style=bold,color=yellow,style=filled]
        … …
        // edges between courses: a larger weight is drawn with a wider line
        SI791_20064->SI549_20064 [weight=2,color=slategray,style="setlinewidth(4)"]
        SI791_20064->SI596_20071 [weight=5,color=slategray,style=bold,style="setlinewidth(10)"]
        SI791_20064->SI616_20071 [weight=2,color=slategray,style=bold,style="setlinewidth(4)"]
        SI791_20064->SI702_20071 [weight=2,color=slategray,style=bold,style="setlinewidth(4)"]
        SI791_20064->SI719_20071 [weight=2,color=slategray,style=bold,style="setlinewidth(4)"]
      }
  • 54. Lada’s school of information course recommender (GraphViz) [Figure: GraphViz layout of the course network; nodes are labeled with course codes such as ARCH531, SI512 and SI702]
  • 55. Lada’s school of information course recommender (GraphViz) [Figure: GraphViz layout of the course network, continued; nodes are labeled with course codes such as MHS663, SI502 and SI596]
  • 57. Other visualization tools: Walrus  developed at CAIDA, available under the GNU GPL  “…best suited to visualizing moderately sized graphs that are nearly trees. A graph with a few hundred thousand nodes and only a slightly greater number of links is likely to be comfortable to work with.”  Java-based  Implemented Features  rendering at a guaranteed frame rate regardless of graph size  coloring nodes and links with a fixed color, or by RGB values stored in attributes  labeling nodes  picking nodes to examine attribute values  displaying a subset of nodes or links based on a user-supplied boolean attribute  interactive pruning of the graph to temporarily reduce clutter and occlusion  zooming in and out Source: CAIDA, http://www.caida.org/tools/visualization/walrus/
  • 58. visualization tools: yEd - Java™ Graph Editor http://www.yworks.com/en/products_yed_about.htm (good primarily for layouts)
  • 59. yEd and 26,000 nodes (takes a few seconds)
  • 60. visualization tools: Prefuse  user interface toolkit for interactive information visualization  built in Java using Java2D graphics library  data structures and algorithms  pipeline architecture featuring reusable, composable modules  animation and rendering support  architectural techniques for scalability  requires knowledge of Java programming  website: http://prefuse.sourceforge.net
  • 61. Simple prefuse visualizations Source: Prefuse, https://github.com/prefuse
  • 62. Examples of prefuse applications: flow maps A flow map of migration from California from 1995-2000, generated automatically by our system using edge routing but no layout adjustment.  http://graphics.stanford.edu/papers/flow_map_layout/
  • 63. Examples of prefuse applications: vizster  http://jheer.org/vizster
  • 64. Outline  Network metrics can help us characterize networks  This has its roots in graph theory  Today there are many network analysis tools to choose from  though most of them are in beta!