Introduction to Network Science
Riadh DHAOU based on the lecture of
Albert-László Barabási
Section 2: FROM SADDAM HUSSEIN TO NETWORK THEORY
Network Science: Introduction
A SIMPLE STORY (1) The fate of Saddam and network science
The capture of Saddam Hussein:
• shows the strong predictive power of networks;
• underscores the need to obtain accurate maps of the networks we aim to study, and the often heroic difficulties we encounter during the mapping process;
• demonstrates the remarkable stability of these networks: the capture of Hussein was not based on fresh intelligence, but rather on his pre-invasion social links, unearthed from old photos stacked in his family album;
• shows that the choice of network we focus on makes a huge difference: the hierarchical tree that captured the official organization of the Iraqi government was of no use when it came to Saddam Hussein's whereabouts.
Section 3 VULNERABILITY DUE TO INTERCONNECTIVITY
A SIMPLE STORY (2): August 15, 2003 blackout.
August 14, 2003: 9:29pm EDT
20 hours before
August 15, 2003: 9:14pm EDT
7 hours after
An important theme of this class:
• we must understand how network structure affects the robustness of a complex system;
• we must develop quantitative tools to assess the interplay between network structure and the dynamical processes on the networks, and their impact on failures.
We will learn that in reality failures follow reproducible laws that can be quantified and even predicted using the tools of network science.
Section 4 NETWORKS AT THE HEART OF COMPLEX SYSTEMS
Network Science: Introduction
[adj., v. kuh m-pleks, kom-pleks; n. kom-pleks]
–adjective
1.
composed of many interconnected parts;
compound; composite: a complex highway
system.
2.
characterized by a very complicated or
involved arrangement of parts, units, etc.:
complex machinery.
3.
so complicated or intricate as to be hard to
understand or deal with: a complex problem.
Source: Dictionary.com
COMPLEX SYSTEMS
Complexity, a scientific theory which
asserts that some systems display
behavioral phenomena that are completely
inexplicable by any conventional analysis
of the systems’ constituent parts. These
phenomena, commonly referred to as
emergent behaviour, seem to occur in
many complex systems involving living
organisms, such as a stock market or the
human brain.
Source: John L. Casti, Encyclopædia Britannica
Network Science: Introduction
THE ROLE OF NETWORKS
Behind each complex system there is a network that defines the interactions between the components.
Network Science: Introduction
Keith Shepherd's "Sunday Best". http://baseballart.com/2010/07/shades-of-greatness-a-story-that-needed-to-be-told/
The “Social Graph” behind Facebook
SOCIETY Factoid:
Network Science: Introduction
: departments
: consultants
: external experts
www.orgnet.com
STRUCTURE OF AN ORGANIZATION
Network Science: Introduction
Brain
Human Brain
has between
10-100 billion
neurons.
BRAIN Factoid:
Network Science: Introduction
The subtle financial networks
Network Science: Introduction
The not so subtle financial networks: 2011
Network Science: Introduction
Nodes:
Links:
http://ecclectic.ss.uci.edu/~drwhite/Movie
BUSINESS TIES IN US BIOTECH-INDUSTRY
Companies
Investment
Pharma
Research Labs
Public
Biotechnology
Collaborations
Financial
R&D
Network Science: Introduction
INTERNET
domain2
domain1
domain3
router
Network Science: Introduction
Drosophila
Melanogaster
Homo
Sapiens
In the generic networks shown, the points
represent the elements of each organism’s genetic
network, and the dotted lines show the
interactions between them.
HUMANS GENES
Network Science: Introduction
Complex systems
Made of many non-identical elements connected by
diverse interactions.
NETWORK
HUMANS GENES
Drosophila
Melanogaster
Homo
Sapiens
Network Science: Introduction
THE ROLE OF NETWORKS
Network Science: Introduction
Behind each system studied in complexity there is an intricate wiring diagram, or a network, that defines the interactions between the components.
We will never understand complex systems unless we map out and understand the networks behind them.
TWO FORCES HELPED THE
EMERGENCE OF NETWORK
SCIENCE
Section 5
Network Science: Introduction
Graph theory: 1735, Euler
Social Network Research: 1930s, Moreno
Communication networks/internet: 1960s
Ecological Networks: May, 1979.
THE HISTORY OF NETWORK ANALYSIS
Network Science: Introduction
THE HISTORY OF NETWORK ANALYSIS
Network Science: Introduction
(Random Networks in Graph Theory)
(Social Network)
The emergence of network maps:
Network Science: Introduction
THE EMERGENCE OF NETWORK SCIENCE
Movie Actor Network, 1998;
World Wide Web, 1999.
C elegans neural wiring diagram 1990
Citation Network, 1998
Metabolic Network, 2000;
Protein-protein Interaction (PPI) network, 2001
The universality of network characteristics:
Network Science: Introduction
THE EMERGENCE OF NETWORK SCIENCE
The architecture of networks
emerging in various domains of
science, nature, and technology are
more similar to each other than one
would have expected.
THE CHARACTERISTICS OF NETWORK
SCIENCE
Section 6
Network Science: Introduction
Interdisciplinary
Quantitative and Mathematical
Computational
Empirical, data driven
Network Science: Introduction
THE CHARACTERISTICS OF NETWORK SCIENCE
THE IMPACT OF NETWORK SCIENCE
Section 7
Network Science: Introduction
Google
Market Cap(2010 Jan 1):
$189 billion
Cisco Systems
networking gear Market
cap (Jan 1, 2010):
$112 billion
Facebook
market cap:
$50 billion
www.bizjournals.com/austin/news/2010/11/
15/facebooks... - Cached
Network Science: Introduction
ECONOMIC IMPACT
Reduces
Inflammation
Fever
Pain
Prevents
Heart attack
Stroke
Causes
Bleeding
Ulcer
Reduces the risk of
Alzheimer's Disease
COX2
Reduces the risk of
breast cancer
ovarian cancers
colorectal cancer
DRUG DESIGN, METABOLIC ENGINEERING:
Network Science: Introduction
DRUG DESIGN, METABOLIC ENGINEERING:
HUMAN DISEASE NETWORK
Network Science: Introduction
The network behind a military engagement
Predicting the H1N1 pandemic
Network Science: Introduction
In September 2010 the National Institutes of Health awarded $40 million to researchers at Harvard, Washington University in St. Louis, the University of Minnesota and UCLA, to develop the technologies that could systematically map out brain circuits.
The Human Connectome Project (HCP) has the ambitious goal of constructing a map of the complete structural and functional neural connections in vivo within and across individuals.
http://www.humanconnectomeproject.org/overview/
Network Science: Introduction
BRAIN RESEARCH
Barabasi Lab
Management
SCIENTIFIC IMPACT
Section 8
Network Science: Introduction
NETWORK SCIENCE The science of the 21st century
Network Science: Introduction
Years
Times
cited
SUMMARY
Section 9
Network Science: Introduction
Network Science: Introduction
NGRAMS Networks Awareness
If you want to understand the spread of diseases, can you do it without networks?
If you want to understand the structure of the WWW, its searchability, etc., it is hopeless without invoking the Web's topology.
If you want to understand human diseases, it is hopeless without considering the wiring diagram of the cell.
Network Science: Introduction
MOST IMPORTANT Networks Really Matter
Some Graph Properties
Section 10
Network Science: Introduction
Degree distribution
$P(k)$: probability that a randomly chosen node has degree $k$
$N_k$ = number of nodes with degree $k$
$P(k) = N_k / N$ → plot
DEGREE DISTRIBUTION
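The definition $P(k) = N_k / N$ can be sketched in a few lines of Python. This is only an illustration: the function name `degree_distribution` and the toy edge list are assumptions made here, not part of the lecture.

```python
from collections import Counter

def degree_distribution(edges, num_nodes):
    """Return {k: P(k)} for an undirected graph given as a list of (u, v) links."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # Nodes that never appear in a link have degree 0.
    counts = Counter(degree.get(node, 0) for node in range(num_nodes))
    return {k: n_k / num_nodes for k, n_k in sorted(counts.items())}

# Hypothetical toy network: a triangle 0-1-2, a pendant node 3, and an isolated node 4.
toy_edges = [(0, 1), (1, 2), (2, 0), (2, 3)]
toy_pk = degree_distribution(toy_edges, num_nodes=5)
print(toy_pk)
```

Plotting `toy_pk` as a histogram gives the kind of degree distribution plot the slide refers to.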
The maximum number of links a network of N nodes can have is:
$L_{max} = \binom{N}{2} = \frac{N(N-1)}{2}$
A graph with $L = L_{max}$ links is called a complete graph, and its average degree is $\langle k\rangle = N-1$.
Network Science: Graph Theory
COMPLETE GRAPH
Most networks observed in real systems are sparse: $L \ll L_{max}$, or $\langle k\rangle \ll N-1$.

Network                  N          L           L_max        <k>
WWW (ND Sample)          325,729    1.4×10^6    10^12        4.51
Protein (S. Cerevisiae)  1,870      4,470       10^7         2.39
Coauthorship (Math)      70,975     2×10^5      3×10^10      3.9
Movie Actors             212,250    6×10^6      1.8×10^13    28.78

(Source: Albert, Barabasi, RMP 2002)
Network Science: Graph Theory
REAL NETWORKS ARE SPARSE
ADJACENCY MATRICES ARE SPARSE
Network Science: Graph Theory
A bipartite graph (or bigraph) is a graph whose nodes can be divided into two disjoint sets U and V such that every link connects a node in U to one in V; that is, U and V are independent sets.
Examples:
Hollywood actor network
Collaboration networks
Disease network (diseasome)
BIPARTITE GRAPHS
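One way to check the definition above on a small network is a breadth-first two-coloring. The sketch below is illustrative only: the function `two_color` and the actor/movie names are hypothetical, chosen to mirror the Hollywood example.

```python
from collections import deque

def two_color(adjacency):
    """Try to split nodes into sets U and V so that every link joins U to V.

    Returns (U, V) if the graph is bipartite, otherwise None.
    """
    side = {}
    for start in adjacency:
        if start in side:
            continue
        side[start] = 0
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in side:
                    side[neighbor] = 1 - side[node]
                    queue.append(neighbor)
                elif side[neighbor] == side[node]:
                    return None  # a link inside one set: not bipartite
    U = {n for n, s in side.items() if s == 0}
    V = {n for n, s in side.items() if s == 1}
    return U, V

# Hypothetical actor-movie network: actors only link to movies.
actor_movie = {
    "actor_a": ["movie_1", "movie_2"],
    "actor_b": ["movie_2"],
    "movie_1": ["actor_a"],
    "movie_2": ["actor_a", "actor_b"],
}
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
print(two_color(actor_movie))  # actors in one set, movies in the other
print(two_color(triangle))     # None: a triangle cannot be two-colored
```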
Network Science: Graph Theory
Gene network
GENOME
PHENOME
DISEASOME
Disease network
Goh, Cusick, Valle, Childs, Vidal & Barabási, PNAS (2007)
GENE NETWORK – DISEASE NETWORK
Network Science: Graph Theory
HUMAN DISEASE NETWORK
Y.-Y. Ahn, S. E. Ahnert, J. P. Bagrow, A.-L. Barabási, Flavor network and the principles of food pairing, Scientific Reports 1, 196 (2011).
Ingredient-Flavor Bipartite Network
Network Science: Graph Theory
Clustering coefficient
Section 11
Clustering coefficient:
what fraction of your neighbors are connected?
Node i with degree $k_i$ and $L_i$ links among its neighbors:
$C_i = \frac{2 L_i}{k_i (k_i - 1)}$, with $C_i \in [0,1]$
Network Science: Graph Theory
CLUSTERING COEFFICIENT
Watts & Strogatz, Nature 1998.
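A minimal sketch of the local clustering coefficient, assuming the standard form $C_i = 2L_i / (k_i(k_i-1))$, where $L_i$ counts the links among the neighbors of node i. The function name, the toy adjacency sets, and the convention $C_i = 0$ for nodes with fewer than two neighbors are assumptions made for this example.

```python
from itertools import combinations

def local_clustering(adjacency, node):
    """C_i = 2 L_i / (k_i (k_i - 1)); defined here as 0.0 when k_i < 2."""
    neighbors = adjacency[node]
    k = len(neighbors)
    if k < 2:
        return 0.0
    links_between = sum(1 for a, b in combinations(neighbors, 2) if b in adjacency[a])
    return 2 * links_between / (k * (k - 1))

# Hypothetical toy network: triangle 0-1-2 plus a pendant node 3 attached to node 0.
toy = {
    0: {1, 2, 3},
    1: {0, 2},
    2: {0, 1},
    3: {0},
}
for node in toy:
    print(node, local_clustering(toy, node))
```

Node 0 has three neighbors and only one link among them, so only one of its three possible neighbor pairs is connected.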
Random Networks
Section 12
RANDOM NETWORK MODEL
The random network model
Section 12.1
Erdös-Rényi model (1960)
Connect with probability p
p=1/6 N=10
<k> ~ 1.5
Pál Erdös
(1913-1996)
Alfréd Rényi
(1921-1970)
RANDOM NETWORK MODEL
RANDOM NETWORK MODEL
Network Science: Random
Definition:
A random graph is a graph of N nodes where each pair of nodes is connected with probability p.
RANDOM NETWORK MODEL
p=1/6
N=12
L=8 L=10 L=7
RANDOM NETWORK MODEL
p=0.03
N=100
The number of links is variable
Section 12.2
Number of links in a random network
P(L): the probability to have exactly L links in a network of N nodes and probability p:
Network Science: Random Graphs
$P(L) = \binom{\binom{N}{2}}{L} p^{L} (1-p)^{\frac{N(N-1)}{2} - L}$
The maximum number of links
in a network of N nodes.
Number of different ways we can choose
L links among all potential links.
Binomial distribution...
MATH TUTORIAL Binomial Distribution: The bottom line
Network Science: Random Graphs
http://keral2008.blogspot.com/2008/10/derivation-of-mean-and-variance-of.html
$P(x) = \binom{N}{x} p^{x} (1-p)^{N-x}$
$\langle x\rangle = Np$
$\langle x^{2}\rangle = p(1-p)N + p^{2}N^{2}$
$\sigma_x = \left(\langle x^{2}\rangle - \langle x\rangle^{2}\right)^{1/2} = [p(1-p)N]^{1/2}$
RANDOM NETWORK MODEL
P(L): the probability to have a network of exactly L links
Network Science: Random Graphs
$P(L) = \binom{\binom{N}{2}}{L} p^{L} (1-p)^{\frac{N(N-1)}{2} - L}$
• The average number of links <L> in a random graph:
$\langle L\rangle = \sum_{L=0}^{\frac{N(N-1)}{2}} L\,P(L) = p\,\frac{N(N-1)}{2}$
• The variance:
$\sigma_L^{2} = p(1-p)\frac{N(N-1)}{2}$
• The average degree:
$\langle k\rangle = \frac{2\langle L\rangle}{N} = p(N-1)$
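The binomial result for the number of links can be checked numerically. The sketch below draws many random networks with the slide's values N=12, p=1/6 and compares the sample mean of L with $p\,N(N-1)/2$. The helper names, the seed and the number of repetitions are arbitrary choices made for illustration.

```python
import random

def expected_links(n, p):
    """<L> = p N (N - 1) / 2."""
    return p * n * (n - 1) / 2

def links_variance(n, p):
    """sigma_L^2 = p (1 - p) N (N - 1) / 2."""
    return p * (1 - p) * n * (n - 1) / 2

def sample_link_count(n, p, rng):
    """Draw one random network G(n, p) and return its number of links."""
    return sum(1 for i in range(n) for j in range(i + 1, n) if rng.random() < p)

rng = random.Random(0)  # fixed seed, chosen arbitrarily for reproducibility
n, p = 12, 1 / 6
counts = [sample_link_count(n, p, rng) for _ in range(5000)]
sample_mean = sum(counts) / len(counts)
print(expected_links(n, p), sample_mean, links_variance(n, p))
```

Individual realizations scatter around the mean, which is why the three example networks on the slide show different values of L.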
Degree distribution
Section 12.3
DEGREE DISTRIBUTION OF A RANDOM GRAPH
Network Science: Random Graphs
As the network size increases, the distribution becomes increasingly narrow—we are
increasingly confident that the degree of a node is in the vicinity of <k>.
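$P(k) = \binom{N-1}{k} p^{k} (1-p)^{(N-1)-k}$
• $\binom{N-1}{k}$: select k nodes from N-1
• $p^{k}$: probability of having k edges
• $(1-p)^{(N-1)-k}$: probability of missing N-1-k edges
$\langle k\rangle = p(N-1)$, $\sigma_k^{2} = p(1-p)(N-1)$
$\frac{\sigma_k}{\langle k\rangle} = \left[\frac{1-p}{p}\,\frac{1}{N-1}\right]^{1/2} \approx \frac{1}{(N-1)^{1/2}}$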
DEGREE DISTRIBUTION OF A RANDOM GRAPH
Network Science: Random Graphs
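$P(k) = \binom{N-1}{k} p^{k} (1-p)^{(N-1)-k}$, with $\langle k\rangle = p(N-1)$, i.e. $p = \frac{\langle k\rangle}{N-1}$
For large N and small k, we can use the following approximations:
$\binom{N-1}{k} = \frac{(N-1)!}{k!(N-1-k)!} = \frac{(N-1)(N-1-1)(N-1-2)\cdots(N-1-k+1)(N-1-k)!}{k!(N-1-k)!} \approx \frac{(N-1)^{k}}{k!}$
$\ln[(1-p)^{(N-1)-k}] = (N-1-k)\ln\left(1-\frac{\langle k\rangle}{N-1}\right) \approx -(N-1-k)\frac{\langle k\rangle}{N-1} = -\langle k\rangle\left(1-\frac{k}{N-1}\right) \cong -\langle k\rangle$
$(1-p)^{(N-1)-k} \approx e^{-\langle k\rangle}$
$P(k) = \binom{N-1}{k} p^{k} (1-p)^{(N-1)-k} \approx \frac{(N-1)^{k}}{k!}\,p^{k}\,e^{-\langle k\rangle} = \frac{(N-1)^{k}}{k!}\left(\frac{\langle k\rangle}{N-1}\right)^{k} e^{-\langle k\rangle} = e^{-\langle k\rangle}\frac{\langle k\rangle^{k}}{k!}$
The logarithm was expanded using
$\ln(1+x) = \sum_{n=1}^{\infty}\frac{(-1)^{n+1}}{n}x^{n} = x - \frac{x^{2}}{2} + \frac{x^{3}}{3} - \dots$ for $-1 < x \le 1$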
POISSON DEGREE DISTRIBUTION
Network Science: Random Graphs
$P(k) = \binom{N-1}{k} p^{k} (1-p)^{(N-1)-k}$, with $\langle k\rangle = p(N-1)$, i.e. $p = \frac{\langle k\rangle}{N-1}$
For large N and small k, we arrive at the Poisson distribution:
$P(k) = e^{-\langle k\rangle}\frac{\langle k\rangle^{k}}{k!}$
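To see how quickly the binomial form approaches the Poisson form, the sketch below evaluates both in log space and reports the largest pointwise gap for a small and a large network with the same average degree. The function names and the parameter choices (N=20 versus N=10,000, <k>=4) are illustrative assumptions.

```python
from math import exp, lgamma, log, log1p

def binomial_degree_pmf(k, n_nodes, mean_degree):
    """Exact P(k) for G(N, p) with p = <k>/(N-1), evaluated in log space."""
    trials = n_nodes - 1
    p = mean_degree / trials
    log_choose = lgamma(trials + 1) - lgamma(k + 1) - lgamma(trials - k + 1)
    return exp(log_choose + k * log(p) + (trials - k) * log1p(-p))

def poisson_degree_pmf(k, mean_degree):
    """Large-N limit: P(k) = e^{-<k>} <k>^k / k!."""
    return exp(-mean_degree + k * log(mean_degree) - lgamma(k + 1))

def max_gap(n_nodes, mean_degree, k_values=range(0, 16)):
    """Largest |binomial - Poisson| over the listed degrees."""
    return max(
        abs(binomial_degree_pmf(k, n_nodes, mean_degree) - poisson_degree_pmf(k, mean_degree))
        for k in k_values
    )

small_n_gap = max_gap(20, 4.0)
large_n_gap = max_gap(10_000, 4.0)
print(small_n_gap, large_n_gap)
```

The gap shrinks as N grows at fixed <k>, which is the regime in which the Poisson form is a good stand-in for the exact binomial.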
DEGREE DISTRIBUTION OF A RANDOM GRAPH
Network Science: Random Graphs
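[Figure: $P(k)$ versus $k$ for the Poisson distribution $P(k) = e^{-\langle k\rangle}\frac{\langle k\rangle^{k}}{k!}$ with $\langle k\rangle = 50$]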
DEGREE DISTRIBUTION OF A RANDOM NETWORK
Exact Result
-binomial distribution-
Large N limit
-Poisson distribution-
Probability
Distribution
Function
(PDF)
Real Networks are not Poisson
Section 12.4
Section 12.5 Maximum and minimum degree
$\langle k\rangle = 1{,}000$, $N = 10^{9}$
$k_{max} = 1{,}185$
$k_{min} = 816$
$P(k_{min}) = e^{-\langle k\rangle}\sum_{k=0}^{k_{min}} \frac{\langle k\rangle^{k}}{k!}$
NO OUTLIERS IN A RANDOM SOCIETY
Network Science: Random Graphs
The most connected individual has degree $k_{max} \sim 1{,}185$.
The least connected individual has degree $k_{min} \sim 816$.
The probability to find an individual with degree k>2,000 is $10^{-27}$. Hence the chance of finding an individual with 2,000 acquaintances is so tiny that such nodes are virtually nonexistent in a random society.
A random society would consist mainly of average individuals, with everyone having roughly the same number of friends.
It would lack outliers, individuals that are either highly popular or reclusive.
$P(k) = e^{-\langle k\rangle}\frac{\langle k\rangle^{k}}{k!}$
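A rough numerical estimate of the extreme degrees in a Poisson "random society" is sketched below. It assumes one particular criterion, namely that $k_{max}$ (or $k_{min}$) is the degree at which the expected number of nodes at or beyond it first reaches one. The slide's quoted values may rest on a slightly different criterion, so the output need not match them to the last digit. The function name and the cut-off `k_cap` are assumptions made here.

```python
from math import exp, lgamma, log

def poisson_extremes(mean_degree, n_nodes, k_cap=None):
    """Estimate (k_min, k_max) for n_nodes draws from a Poisson degree distribution.

    k_max: largest k for which at least one node with degree >= k is expected.
    k_min: smallest k for which at least one node with degree <= k is expected.
    """
    if k_cap is None:
        k_cap = int(3 * mean_degree) + 50  # assumed cut-off; the tail beyond it is negligible
    pmf = [exp(-mean_degree + k * log(mean_degree) - lgamma(k + 1)) for k in range(k_cap + 1)]

    upper_tail, k_max = 0.0, None
    for k in range(k_cap, -1, -1):
        upper_tail += pmf[k]
        if n_nodes * upper_tail >= 1:
            k_max = k
            break

    lower_tail, k_min = 0.0, None
    for k in range(k_cap + 1):
        lower_tail += pmf[k]
        if n_nodes * lower_tail >= 1:
            k_min = k
            break
    return k_min, k_max

k_min, k_max = poisson_extremes(mean_degree=1000, n_nodes=10**9)
print(k_min, k_max)
```

Both extremes stay within roughly 20% of the average degree, which is the sense in which a random society has no outliers.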
FACING REALITY: Degree distribution of real networks
$P(k) = e^{-\langle k\rangle}\frac{\langle k\rangle^{k}}{k!}$
The evolution of a random network
Section 13
<k>
EVOLUTION OF A RANDOM NETWORK
disconnected nodes NETWORK.
How does this transition happen?
<kc>=1 (Erdos and Renyi, 1959)
EVOLUTION OF A RANDOM NETWORK
disconnected nodes NETWORK.
The fact that at least one link per node is necessary to have a giant component is not unexpected. Indeed, for a giant component to exist, each of its nodes must be linked to at least one other node.
It is somewhat unexpected, however, that one link is sufficient for the emergence of a giant component.
It is equally interesting that the emergence of the giant cluster is not gradual, but follows what physicists call a second order phase transition at <k>=1.
Section 13.1
Phase transitions in complex systems I: Magnetism
Phase transitions in complex systems I: liquids
Water Ice
CLUSTER SIZE DISTRIBUTION
$p(s) = \frac{e^{-\langle k\rangle s}\,(\langle k\rangle s)^{s-1}}{s!}$
Probability that a randomly
selected node belongs to a
cluster of size s:
Network Science: Random Graphs
At the critical point <k>=1
The distribution of cluster sizes
at the critical point, displayed in
a log-log plot. The data
represent an average over
1000 systems of sizes
The dashed line has a slope of
−τn = −2.5
Derivation in Newman, 2010
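$\langle k\rangle^{s-1} = \exp[(s-1)\ln\langle k\rangle]$
$p(s) = \frac{s^{s-1}}{s!}\,e^{-\langle k\rangle s + (s-1)\ln\langle k\rangle}$
$s! \approx \sqrt{2\pi s}\left(\frac{s}{e}\right)^{s}$
$p(s) \sim s^{-3/2}\,e^{-(\langle k\rangle - 1)s + (s-1)\ln\langle k\rangle}$
At $\langle k\rangle = 1$: $p(s) \sim s^{-3/2}$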
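The Stirling step can be checked numerically by comparing the exact $p(s) = e^{-\langle k\rangle s}(\langle k\rangle s)^{s-1}/s!$ with its large-s form, including the $1/\sqrt{2\pi}$ prefactor that the "~" notation drops. The function names and the test values (s=1000 at the critical point, <k>=0.5 for the normalization check) are choices made for this sketch.

```python
from math import exp, lgamma, log, pi, sqrt

def cluster_size_probability(s, mean_degree):
    """p(s) = e^{-<k> s} (<k> s)^{s-1} / s!, evaluated in log space."""
    return exp(-mean_degree * s + (s - 1) * log(mean_degree * s) - lgamma(s + 1))

def stirling_form(s, mean_degree):
    """Large-s form s^{-3/2} e^{-(<k>-1)s + (s-1) ln<k>} / sqrt(2 pi)."""
    return s ** -1.5 * exp(-(mean_degree - 1) * s + (s - 1) * log(mean_degree)) / sqrt(2 * pi)

# At the critical point the exponential factor disappears and p(s) ~ s^{-3/2}.
ratio_at_critical = cluster_size_probability(1000, 1.0) / stirling_form(1000, 1.0)

# Below the critical point every node sits in a finite cluster, so p(s) sums to 1.
subcritical_total = sum(cluster_size_probability(s, 0.5) for s in range(1, 400))
print(ratio_at_critical, subcritical_total)
```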
I:
Subcritical
<k> < 1
III:
Supercritical
<k> > 1
IV:
Connected
<k> > ln N
II:
Critical
<k> = 1
<k>=0.5 <k>=1 <k>=3 <k>=5
N=100
<k>
I: Subcritical
$\langle k\rangle < 1$, $p < p_c = 1/N$
• No giant component.
• N-L isolated clusters; the cluster size distribution is exponential:
$p(s) \sim s^{-3/2}\,e^{-(\langle k\rangle - 1)s + (s-1)\ln\langle k\rangle}$
• The largest cluster is a tree, its size ~ ln N
II: Critical
$\langle k\rangle = 1$, $p = p_c = 1/N$
• Unique giant component: $N_G \sim N^{2/3}$; it contains a vanishing fraction of all nodes, $N_G/N \sim N^{-1/3}$
• Small components are trees, the giant component (GC) has loops.
• Cluster size distribution: $p(s) \sim s^{-3/2}$
A jump in the cluster size:
N = 1,000: ln N ≈ 6.9; $N^{2/3}$ ≈ 100
N = $7 \times 10^{9}$: ln N ≈ 22.7; $N^{2/3}$ ≈ 3,659,250
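III: Supercritical
$\langle k\rangle > 1$, $p > p_c = 1/N$ (example shown: $\langle k\rangle = 3$)
• Unique giant component: $N_G \sim (p - p_c)N$
• GC has loops.
• Cluster size distribution: exponential,
$p(s) \sim s^{-3/2}\,e^{-(\langle k\rangle - 1)s + (s-1)\ln\langle k\rangle}$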
IV: Connected
$\langle k\rangle > \ln N$, $p > (\ln N)/N$ (example shown: $\langle k\rangle = 5$)
• Only one cluster: $N_G = N$
• GC is dense.
• Cluster size distribution: none
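The four regimes can be explored with a small simulation. The sketch below samples one random network per average degree and measures the fraction of nodes in the largest cluster using union-find. The network size N=1000, the seed and the function name are arbitrary choices; a single realization fluctuates, especially near the critical point.

```python
import random

def largest_component_fraction(n, mean_degree, rng):
    """Sample G(n, p) with p = <k>/(n-1) and return N_G / n via union-find."""
    p = mean_degree / (n - 1)
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    parent[root_i] = root_j

    sizes = {}
    for node in range(n):
        root = find(node)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values()) / n

rng = random.Random(1)  # arbitrary fixed seed
fractions = {k: largest_component_fraction(1000, k, rng) for k in (0.5, 1.0, 3.0, 5.0)}
for k, fraction in fractions.items():
    print(f"<k> = {k}: N_G/N = {fraction:.3f}")
```

Below <k>=1 the largest cluster holds only a small fraction of the nodes, while well above it a single giant component dominates.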
Network evolution in graph theory
A graph has a given property Q if the probability of having Q approaches 1 as N → ∞. That is, for a given z either almost every graph has the property Q or almost no graph has it. For example, for z less
$p = \langle k\rangle/(N-1)$
Real networks are supercritical
Section 13.2
Small worlds
Section 13.3
Frigyes Karinthy, 1929
Stanley Milgram, 1967
Peter
Jane
Sarah
Ralph
SIX DEGREES small worlds
SIX DEGREES 1929: Frigyes Karinthy
Frigyes Karinthy (1887-1938)
Hungarian Writer
Network Science: Random Graphs
“Look, Selma Lagerlöf just won the Nobel Prize for Literature,
thus she is bound to know King Gustav of Sweden, after all he is
the one who handed her the Prize, as required by tradition. King
Gustav, to be sure, is a passionate tennis player, who always
participates in international tournaments. He is known to have
played Mr. Kehrling, whom he must therefore know for sure, and
as it happens I myself know Mr. Kehrling quite well.”
"The worker knows the manager in the shop, who knows Ford;
Ford is on friendly terms with the general director of Hearst
Publications, who last year became good friends with Arpad
Pasztor, someone I not only know, but to the best of my
knowledge a good friend of mine. So I could easily ask him to
send a telegram via the general director telling Ford that he
should talk to the manager and have the worker in the shop
quickly hammer together a car for me, as I happen to need one."
1929: Minden másképpen van (Everything is Different)
Láncszemek (Chains)
SIX DEGREES 1967: Stanley Milgram
Network Science: Random Graphs
HOW TO TAKE PART IN THIS STUDY
1. ADD YOUR NAME TO THE ROSTER AT THE BOTTOM OF THIS SHEET, so that
the next person who receives this letter will know who it came from.
2. DETACH ONE POSTCARD. FILL IT AND RETURN IT TO HARVARD UNIVERSITY.
No stamp is needed. The postcard is very important. It allows us to keep track of the
progress of the folder as it moves toward the target person.
3. IF YOU KNOW THE TARGET PERSON ON A PERSONAL BASIS, MAIL THIS
FOLDER DIRECTLY TO HIM (HER). Do this only if you have previously met the target
person and know each other on a first name basis.
4. IF YOU DO NOT KNOW THE TARGET PERSON ON A PERSONAL BASIS, DO
NOT TRY TO CONTACT HIM DIRECTLY. INSTEAD, MAIL THIS FOLDER (POST
CARDS AND ALL) TO A PERSONAL ACQUAINTANCE WHO IS MORE LIKELY THAN
YOU TO KNOW THE TARGET PERSON. You may send the folder to a friend, relative or
acquaintance, but it must be someone you know on a first name basis.
SIX DEGREES 1967: Stanley Milgram
Network Science: Random Graphs
SIX DEGREES 1991: John Guare
Network Science: Random Graphs
"Everybody on this planet is separated by only six other people.
Six degrees of separation. Between us and everybody else on
this planet. The president of the United States. A gondolier in
Venice…. It's not just the big names. It's anyone. A native in a
rain forest. A Tierra del Fuegan. An Eskimo. I am bound to
everyone on this planet by a trail of six people. It's a profound
thought. How every person is a new door, opening up into other
worlds."
WWW: 19 DEGREES OF SEPARATION
Image by Matthew Hurst
Blogosphere Network Science: Random Graphs
DISTANCES IN RANDOM GRAPHS
Random graphs tend to have a tree-like topology with almost constant node degrees.
Network Science: Random Graphs
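$N = 1 + \langle k\rangle + \langle k\rangle^{2} + \dots + \langle k\rangle^{d_{max}} = \frac{\langle k\rangle^{d_{max}+1} - 1}{\langle k\rangle - 1} \approx \langle k\rangle^{d_{max}}$
$d_{max} = \frac{\log N}{\log\langle k\rangle}$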
DISTANCES IN RANDOM GRAPHS
Network Science: Random Graphs
$d_{max} = \frac{\log N}{\log\langle k\rangle}$, $\langle d\rangle = \frac{\log N}{\log\langle k\rangle}$
We will call the small world phenomenon the property that the average path length or the diameter depends logarithmically on the system size.
Hence, "small" means that ⟨d⟩ is proportional to log N, rather than N.
In most networks this offers a better approximation to the average distance between two randomly chosen nodes, ⟨d⟩, than to $d_{max}$.
The 1/log⟨k⟩ term implies that the denser the network, the smaller the distance between the nodes.
Network Science: Graph Theory
Average Degree
Given the huge differences in scope, size, and average degree, the agreement is excellent.
DISTANCES IN RANDOM GRAPHS compare with real data
Why are small worlds surprising? Surprising compared to what?
Network Science: Random Graphs
Three, Four or Six Degrees?
For the globe’s social networks:
$\langle k\rangle \simeq 10^{3}$
$N \simeq 7 \times 10^{9}$ for the world's population.
$\langle d\rangle = \frac{\ln N}{\ln\langle k\rangle} = 3.28$
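The arithmetic behind the 3.28 figure is a one-liner. The sketch below wraps $\langle d\rangle = \ln N / \ln\langle k\rangle$ in a hypothetical helper so that other values of N and <k> can be tried.

```python
from math import log

def random_network_distance(n_nodes, mean_degree):
    """<d> ~ ln N / ln <k> for a random network (requires <k> > 1)."""
    return log(n_nodes) / log(mean_degree)

world = random_network_distance(7e9, 1e3)
print(round(world, 2))  # 3.28
```

Because N enters only through its logarithm, even a thousandfold change in population shifts the estimate by just one step at this average degree.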
Image by Matthew Hurst
Blogosphere
Clustering coefficient
Section 9
Since edges are independent and have the same probability p,
$\langle L_i\rangle \cong p\,\frac{k_i(k_i-1)}{2}$
$C_i \equiv \frac{2\langle L_i\rangle}{k_i(k_i-1)} = p = \frac{\langle k\rangle}{N-1}$
• The clustering coefficient of random graphs is small.
• For fixed average degree, C decreases with the system size N.
• C is independent of a node's degree k.
CLUSTERING COEFFICIENT
Network Science: Random Graphs
CLUSTERING COEFFICIENT
Image by Matthew Hurst
Blogosphere
Watts-Strogatz Model
Real networks are not random
Section 10
As quantitative data about real networks became available, we can
compare their topology with the predictions of random graph theory.
Note that once we have N and <k> for a random network, from it we can derive every
measurable property. Indeed, we have:
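Average path length: $\langle l_{rand}\rangle \approx \frac{\log N}{\log\langle k\rangle}$
Clustering coefficient: $C_{rand} = p = \frac{\langle k\rangle}{N-1}$
Degree distribution: $P(k) = e^{-\langle k\rangle}\frac{\langle k\rangle^{k}}{k!}$
ARE REAL NETWORKS LIKE RANDOM GRAPHS?
Network Science: Random Graphs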
Real networks have short distances
like random graphs.
Prediction: $\langle d\rangle = \frac{\log N}{\log\langle k\rangle}$
PATH LENGTHS IN REAL NETWORKS
Network Science: Random Graphs
Prediction:
$C_{rand}$ underestimates the clustering coefficient of real networks by orders of magnitude.
CLUSTERING COEFFICIENT
Network Science: Random Graphs
Prediction: $P(k) = e^{-\langle k\rangle}\frac{\langle k\rangle^{k}}{k!}$
Data: $P(k) \approx k^{-\gamma}$
THE DEGREE DISTRIBUTION
Network Science: Random Graphs
(B) Most important: we need to ask ourselves, are real networks random?
The answer is simply: NO
There is no network in nature that we know of that would be
described by the random network model.
IS THE RANDOM GRAPH MODEL RELEVANT TO REAL SYSTEMS?
Network Science: Random Graphs
It is the reference model for the rest of the class.
It will help us calculate many quantities that can then be compared to the real data, to understand to what degree a particular property is the result of some random process.
We look for patterns that are shared by a large number of real networks, yet which deviate from the predictions of the random network model.
In order to identify these, we need to understand what a particular property would look like if it were driven entirely by random processes.
While WRONG and IRRELEVANT, it will turn out to be extremely USEFUL!
IF IT IS WRONG AND IRRELEVANT, WHY DID WE DEVOTE A FULL CLASS TO IT?
Network Science: Random Graphs
Scale-free property
Section 1
Nodes: WWW documents
Links: URL links
Over 3 billion documents
ROBOT: collects all URL’s
found in a document and
follows them recursively
WORLD WIDE WEB
R. Albert, H. Jeong, A-L Barabasi, Nature, 401 130 (1999).
Power laws and scale-free networks
Section 2
Network Science: Scale-Free Property
Discrete vs. Continuum formalism
Network Science: Scale-Free Property
Discrete Formalism
As node degrees are always positive integers, the
discrete formalism captures the probability that
a node has exactly k links:
Continuum Formalism
In analytical calculations it is often convenient to
assume that the degrees can take up any positive
real value:
INTERPRETATION:
80/20 RULE
Vilfredo Federico Damaso Pareto (1848 – 1923), Italian economist, political scientist and philosopher, who made important contributions to our understanding of income distribution and to the analysis of individuals' choices.
A number of fundamental principles are named after him, like Pareto efficiency, the Pareto distribution (another name for a power-law distribution), and the Pareto principle (or 80/20 law).
Hubs
Section 3
The difference between a power law and an exponential distribution
Let us use the WWW to illustrate the properties of the high-k regime.
The probability to have a node with k~100 is
•About in a Poisson distribution
•About if pk follows a power law.
•Consequently, if the WWW were a random network, according to the Poisson prediction we would expect $10^{-18}$ nodes with degree k>100, i.e. none.
•For a power law degree distribution, we expect about k>100
degree nodes
Network Science: Scale-Free Property
Finite scale-free networks
All real networks are finite; let us explore the consequences.
We have an expected maximum degree, kmax
Estimating kmax
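$\int_{k_{max}}^{\infty} P(k)\,dk \approx \frac{1}{N}$
Why: the probability of finding a node with degree larger than $k_{max}$ should not exceed the probability of having one such node, i.e. a 1/N fraction of all nodes.
$\int_{k_{max}}^{\infty} P(k)\,dk = (\gamma-1)k_{min}^{\gamma-1}\int_{k_{max}}^{\infty} k^{-\gamma}\,dk = \frac{\gamma-1}{-\gamma+1}\,k_{min}^{\gamma-1}\left[k^{-\gamma+1}\right]_{k_{max}}^{\infty} = \frac{k_{min}^{\gamma-1}}{k_{max}^{\gamma-1}} \approx \frac{1}{N}$
$k_{max} = k_{min} N^{\frac{1}{\gamma-1}}$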
The size of the biggest hub
Finite scale-free networks
Expected maximum degree, $k_{max}$:
$k_{max} = k_{min} N^{\frac{1}{\gamma-1}}$
• $k_{max}$ increases with the size of the network: the larger a system is, the larger its biggest hub.
• For γ>2, $k_{max}$ increases slower than N: the largest hub will contain a decreasing fraction of links as N increases.
• For γ=2, $k_{max} \sim N$: the size of the biggest hub is O(N).
• For γ<2, $k_{max}$ increases faster than N (condensation phenomena): the largest hub will grab an increasing fraction of links. Anomaly!
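The scaling of the largest hub with network size can be tabulated directly from $k_{max} = k_{min} N^{1/(\gamma-1)}$. The helper name `natural_cutoff` and the sample values of γ and N below are illustrative choices.

```python
def natural_cutoff(k_min, n_nodes, gamma):
    """k_max = k_min * N^(1/(gamma-1)), valid for gamma > 1."""
    return k_min * n_nodes ** (1.0 / (gamma - 1.0))

for gamma in (2.0, 2.5, 3.0):
    for n_nodes in (10**4, 10**6):
        k_max = natural_cutoff(1, n_nodes, gamma)
        print(f"gamma={gamma}, N={n_nodes}: k_max ~ {k_max:.0f}, k_max/N = {k_max / n_nodes:.2e}")
```

For γ=2 the ratio $k_{max}/N$ stays constant, while for γ>2 it shrinks as N grows, matching the bullet points above.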
The meaning of scale-free
Section 4
Definition:
Networks with a power law tail in their degree distribution are called
‘scale-free networks’
Where does the name come from?
Critical Phenomena and scale-invariance
(a detour)
Slides after Dante R. Chialvo
Scale-free networks: Definition
Network Science: Scale-Free Property
Phase transitions in complex systems I: Magnetism
T = 0.99 Tc
T = 0.999 Tc
ξ ξ
T = Tc T = 1.5 Tc T = 2 Tc
Network Science: Scale-Free Property
At T = Tc:
correlation length
diverges
Fluctuations emerge at
all scales:
scale-free behavior
Scale-free behavior in space
Network Science: Scale-Free Property
• Correlation length diverges at the critical point: the
whole system is correlated!
• Scale invariance: there is no characteristic scale for
the fluctuation (scale-free behavior).
• Universality: exponents are independent of the
system’s details.
CRITICAL PHENOMENA
Network Science: Scale-Free Property
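$P(k) = Ck^{-\gamma}$, $k \in [k_{min}, \infty)$
$\int_{k_{min}}^{\infty} P(k)\,dk = 1$
$C = \frac{1}{\int_{k_{min}}^{\infty} k^{-\gamma}\,dk} = (\gamma-1)k_{min}^{\gamma-1}$
$P(k) = (\gamma-1)k_{min}^{\gamma-1}k^{-\gamma}$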
Divergences in scale-free distributions
Network Science: Scale-Free Property
$\langle k^{m}\rangle = \int_{k_{min}}^{\infty} k^{m}P(k)\,dk$
$\langle k^{m}\rangle = (\gamma-1)k_{min}^{\gamma-1}\int_{k_{min}}^{\infty} k^{m-\gamma}\,dk = \frac{\gamma-1}{m-\gamma+1}\,k_{min}^{\gamma-1}\left[k^{m-\gamma+1}\right]_{k_{min}}^{\infty}$
If m-γ+1<0: $\langle k^{m}\rangle = -\frac{\gamma-1}{m-\gamma+1}\,k_{min}^{m}$
If m-γ+1>0, the integral diverges.
For a fixed γ this means that all moments with m>γ-1 diverge.
Many degree exponents are smaller than 3, so <k²> diverges in the N → ∞ limit!
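The closed form for the moments is easy to tabulate. In the sketch below the function name is hypothetical, and the borderline case m = γ-1 is treated as divergent as well, since the integral then grows logarithmically.

```python
import math

def power_law_moment(m, gamma, k_min=1.0):
    """<k^m> for P(k) = (gamma-1) k_min^(gamma-1) k^(-gamma) on [k_min, inf)."""
    if m - gamma + 1 >= 0:
        return math.inf  # the integral diverges (logarithmically at equality)
    return -(gamma - 1) / (m - gamma + 1) * k_min ** m

for gamma in (2.5, 3.5):
    moments = [power_law_moment(m, gamma) for m in (0, 1, 2)]
    print(f"gamma={gamma}: <k^0>, <k>, <k^2> = {moments}")
```

For γ=2.5 the average degree is finite but the second moment is not, which is the situation the slide highlights for exponents below 3.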
DIVERGENCE OF THE HIGHER MOMENTS
Network Science: Scale-Free Property
The meaning of scale-free
The meaning of scale-free
universality
Section 5
(Faloutsos, Faloutsos and Faloutsos, 1999)
Nodes: computers, routers
Links: physical lines
INTERNET BACKBONE
Network Science: Scale-Free Property
Network Science: Scale-Free Property
$P(k) \sim k^{-\gamma}$ ($\gamma = 3$)
(S. Redner, 1998)
1736 PRL papers (1988)
SCIENCE CITATION INDEX
Nodes: papers
Links: citations
578...
25
H.E. Stanley,...
Network Science: Scale-Free Property
SCIENCE COAUTHORSHIP
M: math
NS: neuroscience
Nodes: scientist (authors)
Links: joint publication
(Newman, 2000, Barabasi et al 2001)
Network Science: Scale-Free Property
Nodes: online user
Links: email contact
Ebel, Mielsch, Bornholdt, PRE 2002.
Kiel University log files
112 days, N=59,912 nodes
Pussokram.com online community;
512 days, 25,000 users.
Holme, Edling, Liljeros, 2002.
ONLINE COMMUNITIES
ONLINE COMMUNITIES
Twitter:
Jake Hoffman, Yahoo,
Facebook
Brian Karrer, Lars Backstrom, Cameron Marlow, 2011
Organisms from all three
domains of life are scale-free!
H. Jeong, B. Tombor, R. Albert, Z.N. Oltvai, and A.L. Barabasi, Nature, 407 651 (2000)
Archaea Bacteria Eukaryotes
$P_{in}(k) \approx k^{-2.2}$
$P_{out}(k) \approx k^{-2.2}$
METABOLIC NETWORK
Network Science: Scale-Free Property
Nodes: actors
Links: cast jointly
N = 212,250 actors
〈k〉 = 28.78
P(k) ~k-γ
Days of Thunder (1990)
Far and Away (1992)
Eyes Wide Shut (1999)
γ=2.3
ACTOR NETWORK
Nodes: people (Females; Males)
Links: sexual relationships
Liljeros et al. Nature 2001
4781 Swedes; 18-74;
59% response rate.
SWEDISH SE-WEB
Network Science: Scale-Free Property
Not all networks are scale-free
•Networks appearing in material science,
like the network describing the bonds
between the atoms in crystalline or
amorphous materials, where each node has
exactly the same degree.
•The neural network of the C.elegans worm.
•The power grid, consisting of generators
and switches connected by transmission
lines
Ultra-small property
Section 6
DISTANCES IN RANDOM GRAPHS
Random graphs tend to have a tree-like topology with almost constant node degrees.
• nr. of first neighbors: $N_1 \cong \langle k\rangle$
• nr. of second neighbors: $N_2 \cong \langle k\rangle^{2}$
• nr. of neighbors at distance d: $N_d \cong \langle k\rangle^{d}$
• estimate maximum distance: $1 + \sum_{l=1}^{l_{max}} \langle k\rangle^{l} = N$, so $l_{max} = \frac{\log N}{\log\langle k\rangle}$
Network Science: Scale-Free Property
Distances in scale-free networks
Size of the biggest hub is of order O(N). Most nodes can be connected within two layers
of it, thus the average path length will be independent of the system size.
The average path length increases slower than logarithmically. In a random network all
nodes have comparable degree, thus most paths will have comparable length. In a
scale-free network the vast majority of the path go through the few high degree hubs,
reducing the distances between nodes.
Some key models produce γ=3, so the result is of particular importance for them. This
was first derived by Bollobas and collaborators for the network diameter in the context of
a dynamical model, but it holds for the average path length as well.
The second moment of the distribution is finite, thus in many ways the network behaves
as a random network. Hence the average path length follows the result that we derived
for the random network model earlier.
Cohen and Havlin, Phys. Rev. Lett. 90, 058701 (2003); Cohen, Havlin and ben-Avraham, in Handbook of Graphs and Networks, Eds. Bornholdt and
Schuster (Wiley-VCH, NY, 2002), Chap. 4. Confirmed also by: Dorogovtsev et al. (2002); Chung and Lu (2002); Bollobás and Riordan (2002);
Bollobás (1985); Newman (2001).
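To see what "slower than logarithmically" means in practice, the sketch below evaluates the leading-order scalings usually quoted for these regimes: ln ln N / ln(γ−1) for 2 < γ < 3, ln N / ln ln N for γ = 3, and ln N for γ > 3. These forms are taken from the cited literature as an assumption and prefactors are omitted, so only the growth with N is meaningful, not the absolute values. The function name is hypothetical.

```python
import math

def path_length_scaling(n, gamma):
    """Leading-order scaling of the average path length (prefactors omitted)."""
    if gamma <= 2:
        raise ValueError("this sketch only covers gamma > 2")
    if gamma < 3:
        return math.log(math.log(n)) / math.log(gamma - 1)  # ultra-small world
    if gamma == 3:
        return math.log(n) / math.log(math.log(n))
    return math.log(n)                                      # small world

def growth_factor(gamma, n_small=1e3, n_large=1e9):
    """How much the scaling grows when N grows from n_small to n_large."""
    return path_length_scaling(n_large, gamma) / path_length_scaling(n_small, gamma)

for gamma in (2.5, 3, 4):
    print(f"gamma = {gamma}: growth factor {growth_factor(gamma):.2f}")
```

Going from a thousand to a billion nodes triples ln N, while the expression for 2 < γ < 3 grows by a factor of about 1.6.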
Ultra Small World
Small World
SMALL WORLD BEHAVIOR IN SCALE-FREE NETWORKS
$k_{max} = k_{min} N^{\frac{1}{\gamma - 1}}$
Why are small worlds surprising? Surprising compared to what?
Network Science: Random Graphs January 31, 2011
Distances in scale-free networks
SMALL WORLD BEHAVIOR IN SCALE-FREE NETWORKS
We are always close to the hubs
"It's always easier to find someone who knows
a famous or popular figure than some
run-of-the-mill, insignificant person."
(Frigyes Karinthy, 1929)
The role of the degree exponent
Section 7
SUMMARY OF THE BEHAVIOR OF SCALE-FREE NETWORKS
Distances in scale-free networks
Graphicality: No large networks for γ<2
$k_{max} = k_{min} N^{\frac{1}{\gamma - 1}}$
In scale-free networks, for γ<2: 1/(γ−1) > 1, so the largest hub would have to grow faster than N.
In order to document a scale-free network, we need 2-3 orders of magnitude of scaling.
That is, $K_{max}/K_{min} \sim 10^2$ to $10^3$.
However, that constrains the system size we require to document it.
For example, to measure an exponent γ=5 over two orders of magnitude of scaling, we need a system size
of the order of
$N = \left(\frac{K_{max}}{K_{min}}\right)^{\gamma - 1} \approx 10^8$
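The size requirement follows from inverting the cutoff formula $k_{max} = k_{min} N^{1/(\gamma-1)}$. A minimal sketch, with hypothetical function names, that reproduces the order of magnitude quoted above (two orders of magnitude of scaling at γ = 5) and shows how quickly the requirement grows with a third order of magnitude:

```python
def required_system_size(k_max, k_min, gamma):
    """Network size N needed for the expected largest degree to reach k_max."""
    if gamma <= 1 or k_min <= 0 or k_max < k_min:
        raise ValueError("need gamma > 1 and 0 < k_min <= k_max")
    return (k_max / k_min) ** (gamma - 1)

def natural_cutoff(n, k_min, gamma):
    """Expected largest degree k_max = k_min * N^(1/(gamma-1))."""
    return k_min * n ** (1.0 / (gamma - 1))

two_orders = required_system_size(100, 1, 5)     # scaling over 10^2
three_orders = required_system_size(1000, 1, 5)  # scaling over 10^3
print(f"{two_orders:.0e} nodes vs {three_orders:.0e} nodes")
```

This is why networks with large exponents are rarely documented: the maps needed to see the power law would have to be enormous.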
Onnela et al. PNAS 2007
Mobile Call Network
$N = 4.6 \times 10^6$
γ = 8.4
Why don’t we see networks with exponents in the range of γ=4,5,6, etc?
Network Science: Scale-Free Property
PLOTTING POWER LAWS
ADVANCED TOPICS 4.B
2,800 Y2H interactions
4,100 binary LC interactions
(HPRD, MINT, BIND, DIP, MIPS)
Rual et al. Nature 2005; Stelze et al. Cell 2005
HUMAN INTERACTION NETWORK
Network Science: Scale-Free Property
(linear scale)
$P(k) \sim (k + k_0)^{-\gamma}$
$k_0 = 1.4$, γ = 2.6.
HUMAN INTERACTION DATA BY RUAL ET AL.
Network Science: Scale-Free Property
COMMON MISCONCEPTIONS
Network Science: Scale-Free Property
Generating networks with a pre-defined $p_k$
Section 8
Configuration model
(1) Degree sequence: Assign a degree to each node,
represented as stubs or half-links. The degree sequence
is either generated analytically from a preselected
distribution (Box 4.5), or it is extracted from the
adjacency matrix of a real network. We must start from
an even number of stubs, otherwise we will be left with
an unpaired stub.
(2) Network assembly: Randomly
select a pair of stubs and connect them. Then randomly
choose another pair from the remaining stubs and
connect them. This procedure is repeated until all stubs
are paired up. Depending on the order in which the
stubs were chosen, we obtain different networks. Some
networks include cycles (2a), others self-edges (2b) or
multi-edges (2c). Yet, the expected number of self- and
multi-edges goes to zero in the N → ∞ limit.
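The two steps can be sketched in a few lines of Python. This is a minimal illustration, not the textbook's code: shuffling the stub list and pairing neighbors in the shuffled order is used here as one way to pick stub pairs uniformly at random, and self-edges and multi-edges are kept rather than rejected. The function names are hypothetical.

```python
import random
from collections import Counter

def configuration_model(degrees, seed=None):
    """Pair stubs at random; the result may contain self-edges and multi-edges."""
    if sum(degrees) % 2 != 0:
        raise ValueError("the degree sequence must give an even number of stubs")
    rng = random.Random(seed)
    # Step 1: node i contributes degrees[i] stubs
    stubs = [node for node, k in enumerate(degrees) for _ in range(k)]
    # Step 2: a random shuffle followed by pairing neighbors is a random matching
    rng.shuffle(stubs)
    return [(stubs[i], stubs[i + 1]) for i in range(0, len(stubs), 2)]

def degree_sequence(links, n):
    """Recover node degrees from a link list (a self-edge counts twice)."""
    counts = Counter()
    for u, v in links:
        counts[u] += 1
        counts[v] += 1
    return [counts[i] for i in range(n)]

degrees = [3, 2, 2, 1]
links = configuration_model(degrees, seed=1)
print(links, degree_sequence(links, len(degrees)))
```

Whatever the pairing, the degree sequence of the result equals the input sequence, which is the point of the model.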
Degree Preserving randomization
Hidden parameter model
Start with N isolated nodes and assign to each node a
“hidden parameter” $\eta_i$, which can be randomly selected
from a ρ(η) distribution. We next connect each node pair
with probability
$p(\eta_i, \eta_j) = \frac{\eta_i \eta_j}{\langle \eta \rangle N}$
For example, the figure shows the probability to connect
nodes (1,3) and (3,4). After connecting the nodes, we end
up with
the networks shown in (b) or (c), representing two
independent realizations generated by the same hidden
parameter sequence (a). The expected number of links in
the obtained network is
$L = \frac{1}{2} \sum_{i,j}^{N} \frac{\eta_i \eta_j}{\langle \eta \rangle N} = \frac{1}{2} \langle \eta \rangle N$
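A minimal sketch of the procedure, assuming the connection probability takes the textbook form $\eta_i \eta_j / (\langle \eta \rangle N)$ (the equation itself did not survive on the slide, so treat it as an assumption). The function names and the uniform choice of η in the example are illustrative only.

```python
import random

def hidden_parameter_network(eta, seed=None):
    """Connect each node pair (i, j) with probability eta_i * eta_j / (<eta> * N)."""
    rng = random.Random(seed)
    n = len(eta)
    mean_eta = sum(eta) / n
    links = []
    for i in range(n):
        for j in range(i + 1, n):
            p = min(1.0, eta[i] * eta[j] / (mean_eta * n))  # cap at 1 for very large eta
            if rng.random() < p:
                links.append((i, j))
    return links

def expected_links(eta):
    """Expected number of links, L = <eta> * N / 2 = sum(eta) / 2."""
    return 0.5 * sum(eta)

eta = [2.0] * 200   # hypothetical choice: every node has the same hidden parameter
links = hidden_parameter_network(eta, seed=42)
print(len(links), expected_links(eta))
```

Two runs with different seeds give different networks from the same hidden parameter sequence, as in panels (b) and (c).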
Decision tree
Case Study: PPI Network
Something to keep in mind
summary
Section 9
Barabási-Albert (BA) model
Section 1
Hubs represent the most striking difference between a random and a
scale-free network. Their emergence in many real systems raises
several fundamental questions:
•Why does the random network model of Erdős and Rényi fail to
reproduce the hubs and the power laws observed in many real
networks?
• Why do systems as different as the WWW and the cell converge to a
similar scale-free architecture?
Growth and preferential attachment
Section 2
networks expand through the addition
of new nodes
Barabási & Albert, Science 286, 509 (1999)
BA MODEL: Growth
ER model:
the number of nodes, N, is fixed (static models)
New nodes prefer to connect to the more connected nodes
Barabási & Albert, Science 286, 509 (1999) Network Science: Evolving Network Models
BA MODEL: Preferential attachment
ER model: links are added randomly to the network
Barabási & Albert, Science 286, 509 (1999) Network Science: Evolving Network Models
Section 2: Growth and Preferential Attachment
The random network model differs from real networks in two important
characteristics:
Growth: While the random network model assumes that the number of nodes is
fixed (time invariant), real networks are the result of a growth process that
continuously increases N.
Preferential Attachment: While nodes in random networks randomly choose their
interaction partner, in real networks new nodes prefer to link to the more connected
nodes.
The Barabási-Albert model
Section 3
Barabási & Albert, Science 286, 509 (1999)
$P(k) \sim k^{-3}$
(1) Networks continuously expand by the
addition of new nodes.
WWW: addition of new documents
GROWTH:
add a new node with m links
(2) New nodes prefer to link to highly
connected nodes.
WWW: linking to well-known sites
PREFERENTIAL ATTACHMENT:
the probability that a new node connects to a node
with k links is proportional to k.
Network Science: Evolving Network Models
Origin of SF networks: Growth and preferential attachment
$\Pi(k_i) = \frac{k_i}{\sum_j k_j}$
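Growth and preferential attachment can be combined in a short simulation. The sketch below is one possible implementation, not the authors' code: the starting network (a complete graph on m + 1 nodes) is an assumption, and preferential attachment is implemented by drawing uniformly from a list in which every node appears once per link it has, so node i is picked with probability $k_i / \sum_j k_j$.

```python
import random
from collections import Counter

def barabasi_albert(n, m, seed=None):
    """Grow a network to n nodes; every new node arrives with m links."""
    if m < 1 or n <= m:
        raise ValueError("need 1 <= m < n")
    rng = random.Random(seed)
    # Assumed starting network: complete graph on m + 1 nodes
    links = [(i, j) for i in range(m + 1) for j in range(i + 1, m + 1)]
    # Node i appears k_i times here, so a uniform draw is degree-proportional
    endpoints = [node for link in links for node in link]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:          # m distinct targets, no multi-links
            targets.add(rng.choice(endpoints))
        for t in targets:
            links.append((new, t))
            endpoints.extend((new, t))
    return links

links = barabasi_albert(200, 2, seed=0)
degree = Counter(node for link in links for node in link)
print("largest degree:", max(degree.values()), "smallest:", min(degree.values()))
```

Early nodes accumulate links fastest, which is how the hubs emerge.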
Section 4
The structure and function of
complex networks
Section 3
Section 4
(a) A food web of predator-prey interactions between species in a
freshwater lake [272]. Picture courtesy of Neo Martinez and
Richard Williams.
(b) The network of collaborations between scientists at a private
research institution [171].
(c) A network of sexual contacts between individuals in the study by
Potterat et al. [342].
Degree Correlation
Section 3
Section 4
Protein interaction map of yeast
Hubs Avoiding hubs
Celebrity Couples
Hubs Dating Hubs
The probability that nodes with degrees k and
k′ link to each other: $p_{k,k'} = \frac{k k'}{2L}$
Section 4 Assortativity and Disassortativity
Politics is Never Neutral
The network behind the US political blogosphere illustrates the presence of assortative mixing, known in sociology as homophily:
nodes of similar characteristics tend to link to each other.
Network Robustness
Section 3
Section 4
“Robust” comes from the Latin Quercus robur,
meaning oak, the symbol of strength and
longevity in the ancient world.
The tree in the figure stands near the Hungarian
village Diosviszlo and is documented at
www.dendromania.hu, a site that catalogs Hungary's oldest
and largest trees.
Image courtesy of Gyorgy Posfai.
Section 4
Breakdown Thresholds Under Random
Failures and Attacks
The table shows the estimated $f_c$ for random
node failures (second column) and attacks
(fourth column) for ten reference networks.
The third column (randomized network)
offers $f_c$ for a network whose N and L
coincide with those of the original network, but
whose nodes are connected randomly to
each other.
For most networks $f_c$ for random failures
exceeds $f_c$ for the corresponding randomized
network, indicating that these networks
display enhanced robustness. Three
networks lack
this property: the power grid, a consequence
of the fact that its degree distribution is
exponential, and the actor and the citation
networks, which have a very high ⟨k⟩,
diminishing the role of the high ⟨k²⟩.
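The roles of ⟨k⟩ and ⟨k²⟩ can be made concrete with the usual estimate of the breakdown threshold under random node failures, $f_c = 1 - \frac{1}{\langle k^2 \rangle / \langle k \rangle - 1}$. The formula is not printed on this slide, so it is stated here as an assumption (the Molloy-Reed based result), and the degree lists are made-up examples.

```python
def breakdown_threshold(degrees):
    """Estimate f_c = 1 - 1 / (<k^2>/<k> - 1) for random node failures."""
    n = len(degrees)
    mean_k = sum(degrees) / n
    mean_k2 = sum(k * k for k in degrees) / n
    kappa = mean_k2 / mean_k
    if kappa <= 2:
        return 0.0          # no giant component to break in the first place
    return 1.0 - 1.0 / (kappa - 1.0)

regular = [3] * 9                     # every node has degree 3
with_hub = [1, 1, 1, 1, 1, 1, 1, 1, 8]  # a single hub raises <k^2>
print(breakdown_threshold(regular), breakdown_threshold(with_hub))
```

The hub raises ⟨k²⟩ and with it the fraction of nodes that must fail before the network falls apart.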
Communities
Section 3
Section 4
Communities in Belgium
Communities extracted from the call pattern
of the consumers of the largest Belgian
mobile phone company. The network has
about two million mobile phone users. The
nodes correspond to communities, the size
of each node being proportional to the
number of individuals in the corresponding
community.
The color of each community on a red–green
scale represents the language spoken in the
particular community, red for French and
green for Dutch. Only communities of more
than 100 individuals are shown. The
community that connects the two main
clusters consists of several smaller
communities with less obvious language
separation, capturing the culturally mixed
Brussels, the country’s capital.
Section 4
HIERARCHICAL CLUSTERING
AGGLOMERATIVE PROCEDURES: THE
RAVASZ ALGORITHM
Step 1: Define the Similarity Matrix
Step 2: Decide Group Similarity
Step 3: Apply Hierarchical Clustering
Step 4: Dendrogram
DIVISIVE PROCEDURES: THE GIRVAN-
NEWMAN ALGORITHM
Step 1: Define Centrality
Step 2: Hierarchical Clustering
Section 4
Here Θ(x) is the Heaviside step function,
which is zero for x≤0 and one for x>0; J(i, j) is
the number of common neighbors of nodes i
and j, to which we add one (+1) if there is a
direct link between i and j; min(k_i, k_j) is the
smaller of the degrees k_i and k_j.
Topological Overlap Matrix
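A minimal sketch of one entry of the topological overlap matrix, assuming the standard form used with the Ravasz algorithm, $x^o_{ij} = \frac{J(i,j)}{\min(k_i, k_j) + 1 - \Theta(A_{ij})}$ (the equation itself is missing from the slide text). The adjacency representation, a dict of neighbor sets, is a choice made for this example.

```python
def topological_overlap(adj, i, j):
    """Topological overlap of nodes i and j; adj maps node -> set of neighbors."""
    linked = 1 if j in adj[i] else 0            # Theta(A_ij)
    common = len(adj[i] & adj[j]) + linked      # J(i, j)
    return common / (min(len(adj[i]), len(adj[j])) + 1 - linked)

# Hypothetical test networks: a triangle and a three-node path
triangle = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B"}}
path = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}}

print(topological_overlap(triangle, "A", "B"))  # linked, share neighbor C
print(topological_overlap(path, "A", "C"))      # not linked, share neighbor B
```

Nodes that are linked and share all their neighbors reach the maximum overlap of one; unlinked nodes with no common neighbors get zero.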
Section 4
Cluster Similarity
Section 4
Centrality Measure
Centrality Measures
Divisive algorithms require a centrality measure that is high for node pairs that belong to different communities and is
low for node pairs in the same community. Two frequently used measures can achieve this:
(a) Link Betweenness
Link betweenness captures the role of each link in information transfer. Hence xij is proportional to the number of
shortest paths between all node pairs that run along the link (i,j). Consequently, inter-community links, like the
central link in the figure with xij =0.57, have large betweenness.
(b) Random-Walk Betweenness
A pair of nodes m and n are chosen at random. A walker starts at m, following each adjacent link with equal
probability until it reaches n. Random walk betweenness xij is the probability that the link i→j was crossed by the
walker after averaging over all possible choices for the starting nodes m and n.
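Link betweenness as described in (a) can be computed with a breadth-first search from every node. The sketch below is a plain-Python variant of that idea, not the implementation used for the figure: it returns raw path counts (split evenly when shortest paths tie), whereas values such as 0.57 in the figure are presumably normalized. The example network, two triangles joined by one link, is hypothetical.

```python
from collections import deque, defaultdict

def link_betweenness(adj):
    """Shortest paths crossing each link, summed over all unordered node pairs."""
    score = defaultdict(float)
    for s in adj:
        dist = {s: 0}
        sigma = defaultdict(float)   # number of shortest paths from s
        sigma[s] = 1.0
        preds = defaultdict(list)
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = defaultdict(float)
        for w in reversed(order):    # accumulate from the farthest nodes back to s
            for v in preds[w]:
                share = sigma[v] / sigma[w] * (1.0 + delta[w])
                score[frozenset((v, w))] += share
                delta[v] += share
    # every pair was counted once from each of its two end nodes
    return {link: value / 2.0 for link, value in score.items()}

adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
scores = link_betweenness(adj)
bridge = max(scores, key=scores.get)
print(sorted(bridge), scores[bridge])
```

The link joining the two triangles carries every path between them, so it scores highest and would be the first link removed by a divisive algorithm.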
Section 4
The Girvan-Newman Algorithm
The partition’s modularity is
obtained by summing over all
nc communities
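The sum over communities can be written out directly. The sketch assumes the standard form $M = \sum_{c=1}^{n_c} \left[ \frac{L_c}{L} - \left( \frac{k_c}{2L} \right)^2 \right]$, where $L_c$ is the number of links inside community c and $k_c$ is the total degree of its nodes; the formula is not printed in the slide text, and the example network is made up.

```python
def modularity(links, communities):
    """M = sum over communities of L_c / L - (k_c / 2L)^2."""
    total = len(links)
    label = {node: c for c, members in enumerate(communities) for node in members}
    inside = [0] * len(communities)       # L_c: links within community c
    degree_sum = [0] * len(communities)   # k_c: total degree of community c
    for u, v in links:
        degree_sum[label[u]] += 1
        degree_sum[label[v]] += 1
        if label[u] == label[v]:
            inside[label[u]] += 1
    return sum(lc / total - (kc / (2 * total)) ** 2
               for lc, kc in zip(inside, degree_sum))

# Hypothetical network: two triangles joined by a single link
links = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
split = modularity(links, [{0, 1, 2}, {3, 4, 5}])
whole = modularity(links, [{0, 1, 2, 3, 4, 5}])
print(round(split, 3), round(whole, 3))
```

Placing all nodes in one community gives M = 0, while the split into the two triangles gives a positive value, so the latter is the better partition.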
Section 4
Network Science
an interactive textbook
barabasi.com/NetworkScienceBook/
facebook.com/NetworkScienceBook

Introduction_to_Network Science_Riad.pdf

  • 1. Introduction to Network Science Riadh DHAOU based on the lecture of Albert-László Barabási
  • 2. FROM SADDAM HUSSEIN TO NETWORK THEORY Section 2: FROM SADDAM HUSSEIN TO NETWORK THEORY Network Science: Introduction
  • 3. Network Science: Introduction A SIMPLE STORY (1) The fate of Saddam and network science Network Science: Introduction
  • 4. Network Science: Introduction The capture of Saddam Hussein: shows the strong predictive power of networks; underscores the need to obtain accurate maps of the networks we aim to study, and the often heroic difficulties we encounter during the mapping process; demonstrates the remarkable stability of these networks: the capture of Hussein was not based on fresh intelligence, but rather on his pre-invasion social links, unearthed from old photos stacked in his family album; shows that the choice of network we focus on makes a huge difference: the hierarchical tree that captured the official organization of the Iraqi government was of no use when it came to Saddam Hussein's whereabouts. A SIMPLE STORY (1) The fate of Saddam and network science
  • 5. VULNERABILITY DUE TO INTERCONNECTIVITY Section 3 VULNERABILITY DUE TO INTERCONNECTIVITY Network Science: Introduction
  • 6. Network Science: Introduction A SIMPLE STORY (2): August 15, 2003 blackout. August 14, 2003: 9:29pm EDT 20 hours before August 15, 2003: 9:14pm EDT 7 hours after
  • 7. Network Science: Introduction A SIMPLE STORY (2): August 15, 2003 blackout. An important theme of this class: we must understand how network structure affects the robustness of a complex system, and develop quantitative tools to assess the interplay between network structure and the dynamical processes on the networks, and their impact on failures. We will learn that in reality failures follow reproducible laws that can be quantified and even predicted using the tools of network science.
  • 8. NETWORKS AT THE HEART OF COMPLEX SYSTEMS Section 4 NETWORKS AT THE HEART OF COMPLEX SYSTEMS Network Science: Introduction
  • 9. [adj., v. kuh m-pleks, kom-pleks; n. kom-pleks] –adjective 1. composed of many interconnected parts; compound; composite: a complex highway system. 2. characterized by a very complicated or involved arrangement of parts, units, etc.: complex machinery. 3. so complicated or intricate as to be hard to understand or deal with: a complex problem. Source: Dictionary.com COMPLEX SYSTEMS Complexity, a scientific theory which asserts that some systems display behavioral phenomena that are completely inexplicable by any conventional analysis of the systems’ constituent parts. These phenomena, commonly referred to as emergent behaviour, seem to occur in many complex systems involving living organisms, such as a stock market or the human brain. Source: John L. Casti, Encyclopædia Britannica Network Science: Introduction
  • 10. THE ROLE OF NETWORKS Behind each complex system there is a network that defines the interactions between the components. Network Science: Introduction
  • 11. Keith Shepherd's "Sunday Best". http://baseballart.com/2010/07/shades-of-greatness-a-story-that-needed-to-be-told/ The “Social Graph” behind Facebook SOCIETY Factoid: Network Science: Introduction
  • 12. : departments : consultants : external experts www.orgnet.com STRUCTURE OF AN ORGANIZATION Network Science: Introduction
  • 13. Brain Human Brain has between 10-100 billion neurons. BRAIN Factoid: Network Science: Introduction
  • 14. The subtle financial networks Network Science: Introduction
  • 15. The not so subtle financial networks: 2011 Network Science: Introduction
  • 16. Nodes: Links: http://ecclectic.ss.uci.edu/~drwhite/Movie BUSINESS TIES IN US BIOTECH-INDUSTRY Companies Investment Pharma Research Labs Public Biotechnology Collaborations Financial R&D Network Science: Introduction
  • 18. Drosophila Melanogaster Homo Sapiens In the generic networks shown, the points represent the elements of each organism’s genetic network, and the dotted lines show the interactions between them. HUMANS GENES Network Science: Introduction
  • 19. Complex systems Made of many non-identical elements connected by diverse interactions. NETWORK HUMANS GENES Drosophila Melanogaster Homo Sapiens Network Science: Introduction
  • 20. THE ROLE OF NETWORKS Network Science: Introduction Behind each system studied in complexity there is an intricate wiring diagram, or a network, that defines the interactions between the components. We will never understand complex systems unless we map out and understand the networks behind them.
  • 21. TWO FORCES HELPED THE EMERGENCE OF NETWORK SCIENCE Section 5 Network Science: Introduction
  • 22. Graph theory: 1735, Euler Social Network Research: 1930s, Moreno Communication networks/internet: 1960s Ecological Networks: May, 1979. THE HISTORY OF NETWORK ANALYSIS Network Science: Introduction
  • 23. THE HISTORY OF NETWORK ANALYSIS Network Science: Introduction (Random Networks in Graph Theory) (Social Network)
  • 24. The emergence of network maps: Network Science: Introduction THE EMERGENCE OF NETWORK SCIENCE Movie Actor Network, 1998; World Wide Web, 1999. C elegans neural wiring diagram 1990 Citation Network, 1998 Metabolic Network, 2000; Protein-protein Interaction (PPI) network, 2001
  • 25. The universality of network characteristics: Network Science: Introduction THE EMERGENCE OF NETWORK SCIENCE The architectures of networks emerging in various domains of science, nature, and technology are more similar to each other than one would have expected.
  • 26. THE CHARACTERISTICS OF NETWORK SCIENCE Section 6 Network Science: Introduction
  • 27. Interdisciplinary Quantitative and Mathematical Computational Empirical Network Science: Introduction THE CHARACTERISTICS OF NETWORK SCIENCE
  • 28. Interdisciplinary Quantitative and Mathematical Computational Empirical, data driven Network Science: Introduction THE CHARACTERISTICS OF NETWORK SCIENCE
  • 29. Interdisciplinary Quantitative and Mathematical Computational Empirical Network Science: Introduction THE CHARACTERISTICS OF NETWORK SCIENCE
  • 30. Interdisciplinary Quantitative and Mathematical Computational Empirical Network Science: Introduction THE CHARACTERISTICS OF NETWORK SCIENCE
  • 31. THE IMPACT OF NETWORK SCIENCE Section 7 Network Science: Introduction
  • 32. Google Market Cap(2010 Jan 1): $189 billion Cisco Systems networking gear Market cap (Jan 1, 2919): $112 billion Facebook market cap: $50 billion www.bizjournals.com/austin/news/2010/11/ 15/facebooks... - Cached Network Science: Introduction ECONOMIC IMPACT
  • 33. Reduces Inflammation Fever Pain Prevents Heart attack Stroke Causes Bleeding Ulcer Reduces the risk of Alzheimer's Disease COX2 Reduces the risk of breast cancer ovarian cancers colorectal cancer DRUG DESIGN, METABOLIC ENGINEERING: Network Science: Introduction
  • 34. DRUG DESIGN, METABOLIC ENGINEERING:
  • 36. Network Science: Introduction The network behind a military engagement
  • 37. Predicting the H1N1 pandemic Network Science: Introduction
  • 38. In September 2010 the National Institutes of Health awarded $40 million to researchers at Harvard, Washington University in St. Louis, the University of Minnesota and UCLA, to develop the technologies that could systematically map out brain circuits. The Human Connectome Project (HCP) has the ambitious goal of constructing a map of the complete structural and functional neural connections in vivo within and across individuals. http://www.humanconnectomeproject.org/overview/ Network Science: Introduction BRAIN RESEARCH
  • 43. SCIENTIFIC IMPACT Section 8 Network Science: Introduction
  • 44. NETWORK SCIENCE The science of the 21st century Network Science: Introduction Years Times cited
  • 47. If you were to understand the spread of diseases, can you do it without networks? If you were to understand the WWW structure, searchability, etc., it is hopeless without invoking the Web’s topology. If you want to understand human diseases, it is hopeless without considering the wiring diagram of the cell. Network Science: Introduction MOST IMPORTANT Networks Really Matter
  • 48. Some Graph Properties Section 10 Network Science: Introduction
  • 49. Degree distribution P(k): probability that a randomly chosen node has degree k Nk = # nodes with degree k P(k) = Nk / N plot DEGREE DISTRIBUTION
  • 50. The maximum number of links a network of N nodes can have is: $L_{max} = \binom{N}{2} = \frac{N(N-1)}{2}$ A graph with $L = L_{max}$ links is called a complete graph, and its average degree is ⟨k⟩ = N−1 Network Science: Graph Theory COMPLETE GRAPH
  • 51. Most networks observed in real systems are sparse: $L \ll L_{max}$ or ⟨k⟩ ≪ N−1. WWW (ND Sample): N = 325,729; $L = 1.4 \times 10^6$; $L_{max} = 10^{12}$; ⟨k⟩ = 4.51. Protein (S. cerevisiae): N = 1,870; L = 4,470; $L_{max} = 10^7$; ⟨k⟩ = 2.39. Coauthorship (Math): N = 70,975; $L = 2 \times 10^5$; $L_{max} = 3 \times 10^{10}$; ⟨k⟩ = 3.9. Movie Actors: N = 212,250; $L = 6 \times 10^6$; $L_{max} = 1.8 \times 10^{13}$; ⟨k⟩ = 28.78. (Source: Albert, Barabasi, RMP2002) Network Science: Graph Theory REAL NETWORKS ARE SPARSE
  • 52. ADJACENCY MATRICES ARE SPARSE Network Science: Graph Theory
  • 53. bipartite graph (or bigraph) is a graph whose nodes can be divided into two disjoint sets U and V such that every link connects a node in U to one in V; that is, U and V are independent sets. Examples: Hollywood actor network Collaboration networks Disease network (diseasome) BIPARTITE GRAPHS Network Science: Graph Theory
  • 54. Gene network GENOME PHENOME DISEASOME Disease network Goh, Cusick, Valle, Childs, Vidal & Barabási, PNAS (2007) GENE NETWORK – DISEASE NETWORK Network Science: Graph Theory
  • 56. Y.-Y. Ahn, S. E. Ahnert, J. P. Bagrow, A.-L. Barabási Flavor network and the principles of food pairing , Scientific Reports 196, (2011). Ingredient-Flavor Bipartite Network Network Science: Graph Theory
  • 59. Clustering coefficient: what fraction of your neighbors are connected? Node i with degree ki Ci in [0,1] Network Science: Graph Theory CLUSTERING COEFFICIENT Watts & Strogatz, Nature 1998.
  • 60. Clustering coefficient: what fraction of your neighbors are connected? Node i with degree ki Ci in [0,1] Network Science: Graph Theory CLUSTERING COEFFICIENT Watts & Strogatz, Nature 1998.
  • 63. The random network model Section 12.1
  • 64. Erdös-Rényi model (1960) Connect with probability p p=1/6 N=10 <k> ~ 1.5 Pál Erdös (1913-1996) Alfréd Rényi (1921-1970) RANDOM NETWORK MODEL
  • 65. RANDOM NETWORK MODEL Network Science: Random Graphs Definition: A random graph is a graph of N nodes where each pair of nodes is connected with probability p.
  • 68. The number of links is variable Section 12.2
  • 70. Number of links in a random network P(L): the probability to have exactly L links in a network of N nodes and probability p: Network Science: Random Graphs $P(L) = \binom{\binom{N}{2}}{L} p^L (1-p)^{\frac{N(N-1)}{2} - L}$ Here $\binom{N}{2}$ is the maximum number of links in a network of N nodes, and $\binom{\binom{N}{2}}{L}$ is the number of different ways we can choose L links among all potential links. Binomial distribution...
  • 71. MATH TUTORIAL Binomial Distribution: The bottom line Network Science: Random Graphs http://keral2008.blogspot.com/2008/10/derivation-of-mean-and-variance-of.html $P(x) = \binom{N}{x} p^x (1-p)^{N-x}$, $\langle x \rangle = Np$, $\langle x^2 \rangle = p(1-p)N + p^2 N^2$, $\sigma_x = (\langle x^2 \rangle - \langle x \rangle^2)^{1/2} = [p(1-p)N]^{1/2}$
  • 72. RANDOM NETWORK MODEL P(L): the probability to have a network of exactly L links Network Science: Random Graphs $P(L) = \binom{\binom{N}{2}}{L} p^L (1-p)^{\frac{N(N-1)}{2} - L}$ • The average number of links ⟨L⟩ in a random graph: $\langle L \rangle = \sum_{L=0}^{N(N-1)/2} L \, P(L) = p \frac{N(N-1)}{2}$, so that $\langle k \rangle = 2\langle L \rangle / N = p(N-1)$ • The variance: $\sigma^2 = p(1-p) \frac{N(N-1)}{2}$
  • 74. DEGREE DISTRIBUTION OF A RANDOM GRAPH Network Science: Random Graphs $P(k) = \binom{N-1}{k} p^k (1-p)^{(N-1)-k}$, where $\binom{N-1}{k}$ counts the ways to select k nodes from N−1, $p^k$ is the probability of having k edges, and $(1-p)^{(N-1)-k}$ is the probability of missing N−1−k edges. $\langle k \rangle = p(N-1)$, $\sigma_k^2 = p(1-p)(N-1)$, $\frac{\sigma_k}{\langle k \rangle} = \left[ \frac{1-p}{p} \frac{1}{N-1} \right]^{1/2} \approx \frac{1}{(N-1)^{1/2}}$ As the network size increases, the distribution becomes increasingly narrow: we are increasingly confident that the degree of a node is in the vicinity of ⟨k⟩.
  • 75. DEGREE DISTRIBUTION OF A RANDOM GRAPH Network Science: Random Graphs $P(k) = \binom{N-1}{k} p^k (1-p)^{(N-1)-k}$, $\langle k \rangle = p(N-1)$, $p = \frac{\langle k \rangle}{N-1}$ For large N and small k, we can use the following approximations: $\binom{N-1}{k} = \frac{(N-1)!}{k!(N-1-k)!} = \frac{(N-1)(N-1-1)(N-1-2)\cdots(N-1-k+1)(N-1-k)!}{k!(N-1-k)!} \approx \frac{(N-1)^k}{k!}$ and $\ln[(1-p)^{(N-1)-k}] = (N-1-k) \ln\left(1 - \frac{\langle k \rangle}{N-1}\right) \approx -(N-1-k) \frac{\langle k \rangle}{N-1} = -\langle k \rangle \left(1 - \frac{k}{N-1}\right) \approx -\langle k \rangle$, so $(1-p)^{(N-1)-k} \approx e^{-\langle k \rangle}$. The logarithm was expanded using $\ln(1+x) = \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n} x^n = x - \frac{x^2}{2} + \frac{x^3}{3} - \dots$ for $-1 < x \le 1$. Hence $P(k) = \binom{N-1}{k} p^k (1-p)^{(N-1)-k} \approx \frac{(N-1)^k}{k!} p^k e^{-\langle k \rangle} = \frac{(N-1)^k}{k!} \left( \frac{\langle k \rangle}{N-1} \right)^k e^{-\langle k \rangle} = e^{-\langle k \rangle} \frac{\langle k \rangle^k}{k!}$
  • 76. POISSON DEGREE DISTRIBUTION Network Science: Random Graphs $P(k) = \binom{N-1}{k} p^k (1-p)^{(N-1)-k}$, $\langle k \rangle = p(N-1)$, $p = \frac{\langle k \rangle}{N-1}$ For large N and small k, we arrive at the Poisson distribution: $P(k) = e^{-\langle k \rangle} \frac{\langle k \rangle^k}{k!}$
  • 77. DEGREE DISTRIBUTION OF A RANDOM GRAPH Network Science: Random Graphs Plot of P(k) versus k for ⟨k⟩ = 50: $P(k) = e^{-\langle k \rangle} \frac{\langle k \rangle^k}{k!}$
  • 78. DEGREE DISTRIBUTION OF A RANDOM NETWORK Exact Result -binomial distribution- Large N limit -Poisson distribution- Probability Distribution Function (PDF)
  • 79. Real Networks are not Poisson Section 12.4
  • 80. Section 12.5 Maximum and minimum degree For ⟨k⟩ = 1,000 and $N = 10^9$: kmax = 1,185 and kmin = 816. $P(k_{min}) = e^{-\langle k \rangle} \sum_{k=0}^{k_{min}} \frac{\langle k \rangle^k}{k!}$
  • 81. NO OUTLIERS IN A RANDOM SOCIETY Network Science: Random Graphs The most connected individual has degree kmax ~ 1,185. The least connected individual has degree kmin ~ 816. The probability to find an individual with degree k > 2,000 is $10^{-27}$. Hence the chance of finding an individual with 2,000 acquaintances is so tiny that such nodes are virtually nonexistent in a random society. A random society would consist mainly of average individuals, with everyone having roughly the same number of friends. It would lack outliers, individuals that are either highly popular or reclusive. $P(k) = e^{-\langle k \rangle} \frac{\langle k \rangle^k}{k!}$
  • 82. FACING REALITY: Degree distribution of real networks $P(k) = e^{-\langle k \rangle} \frac{\langle k \rangle^k}{k!}$
  • 83. The evolution of a random network Section 13
  • 85. <k> EVOLUTION OF A RANDOM NETWORK disconnected nodes NETWORK. How does this transition happen?
  • 86. <kc>=1 (Erdos and Renyi, 1959) EVOLUTION OF A RANDOM NETWORK disconnected nodes NETWORK. The fact that at least one link per node is necessary to have a giant component is not unexpected. Indeed, for a giant component to exist, each of its nodes must be linked to at least one other node. It is somewhat unexpected, however that one link is sufficient for the emergence of a giant component. It is equally interesting that the emergence of the giant cluster is not gradual, but follows what physicists call a second order phase transition at <k>=1.
  • 89. <k> EVOLUTION OF A RANDOM NETWORK disconnected nodes NETWORK. How does this transition happen?
  • 90. Phase transitions in complex systems I: Magnetism
  • 91. Phase transitions in complex systems I: liquids Water Ice
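  • 92. CLUSTER SIZE DISTRIBUTION Network Science: Random Graphs Probability that a randomly selected node belongs to a cluster of size s: $p(s) = \frac{e^{-\langle k \rangle s} (\langle k \rangle s)^{s-1}}{s!}$ Derivation in Newman, 2010: using $\langle k \rangle^{s-1} = \exp[(s-1) \ln \langle k \rangle]$ we get $p(s) = \frac{s^{s-1}}{s!} e^{-\langle k \rangle s + (s-1) \ln \langle k \rangle}$; with Stirling's approximation $s! \approx \sqrt{2\pi s} \left( \frac{s}{e} \right)^s$ this gives $p(s) \sim s^{-3/2} e^{-(\langle k \rangle - 1)s + (s-1) \ln \langle k \rangle}$. At the critical point ⟨k⟩ = 1: $p(s) \sim s^{-3/2}$ The distribution of cluster sizes at the critical point, displayed in a log-log plot. The data represent an average over 1000 systems of sizes The dashed line has a slope of $-\tau_n = -2.5$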
  • 93. I: Subcritical <k> < 1 III: Supercritical <k> > 1 IV: Connected <k> > ln N II: Critical <k> = 1 <k>=0.5 <k>=1 <k>=3 <k>=5 N=100 <k>
  • 94. I: Subcritical ⟨k⟩ < 1, $p < p_c = 1/N$. No giant component. N−L isolated clusters; the cluster size distribution is exponential: $p(s) \sim s^{-3/2} e^{-(\langle k \rangle - 1)s + (s-1) \ln \langle k \rangle}$ The largest cluster is a tree, its size ~ ln N.
  • 95. II: Critical ⟨k⟩ = 1, $p = p_c = 1/N$. Unique giant component: $N_G \sim N^{2/3}$; it contains a vanishing fraction of all nodes, $N_G/N \sim N^{-1/3}$. Small components are trees, GC has loops. Cluster size distribution: $p(s) \sim s^{-3/2}$ A jump in the cluster size: for N = 1,000, ln N ≈ 6.9 and $N^{2/3} = 100$; for $N = 7 \times 10^9$, ln N ≈ 22 and $N^{2/3} \approx 3,659,250$.
  • 96. III: Supercritical ⟨k⟩ > 1, $p > p_c = 1/N$ (example shown: ⟨k⟩ = 3). Unique giant component: $N_G \sim (p - p_c)N$. GC has loops. Cluster size distribution: exponential, $p(s) \sim s^{-3/2} e^{-(\langle k \rangle - 1)s + (s-1) \ln \langle k \rangle}$
  • 97. IV: Connected ⟨k⟩ > ln N, p > (ln N)/N (example shown: ⟨k⟩ = 5). Only one cluster: $N_G = N$. GC is dense. Cluster size distribution: None
  • 99. Network evolution in graph theory A graph has a given property Q if the probability of having Q approaches 1 as N → ∞. That is, for a given z either almost every graph has the property Q or almost no graph has it. For example, for z less $p = \langle k \rangle / (N-1)$
  • 101. Real networks are supercritical Section 13.2
  • 104. Frigyes Karinthy, 1929 Stanley Milgram, 1967 Peter Jane Sarah Ralph SIX DEGREES small worlds
  • 105. SIX DEGREES 1929: Frigyes Karinthy Frigyes Karinthy (1887-1938) Hungarian Writer Network Science: Random Graphs “Look, Selma Lagerlöf just won the Nobel Prize for Literature, thus she is bound to know King Gustav of Sweden, after all he is the one who handed her the Prize, as required by tradition. King Gustav, to be sure, is a passionate tennis player, who always participates in international tournaments. He is known to have played Mr. Kehrling, whom he must therefore know for sure, and as it happens I myself know Mr. Kehrling quite well.” "The worker knows the manager in the shop, who knows Ford; Ford is on friendly terms with the general director of Hearst Publications, who last year became good friends with Arpad Pasztor, someone I not only know, but to the best of my knowledge a good friend of mine. So I could easily ask him to send a telegram via the general director telling Ford that he should talk to the manager and have the worker in the shop quickly hammer together a car for me, as I happen to need one." 1929: Minden másképpen van (Everything is Different) Láncszemek (Chains)
  • 106. SIX DEGREES 1967: Stanley Milgram Network Science: Random Graphs HOW TO TAKE PART IN THIS STUDY 1. ADD YOUR NAME TO THE ROSTER AT THE BOTTOM OF THIS SHEET, so that the next person who receives this letter will know who it came from. 2. DETACH ONE POSTCARD. FILL IT AND RETURN IT TO HARVARD UNIVERSITY. No stamp is needed. The postcard is very important. It allows us to keep track of the progress of the folder as it moves toward the target person. 3. IF YOU KNOW THE TARGET PERSON ON A PERSONAL BASIS, MAIL THIS FOLDER DIRECTLY TO HIM (HER). Do this only if you have previously met the target person and know each other on a first name basis. 4. IF YOU DO NOT KNOW THE TARGET PERSON ON A PERSONAL BASIS, DO NOT TRY TO CONTACT HIM DIRECTLY. INSTEAD, MAIL THIS FOLDER (POST CARDS AND ALL) TO A PERSONAL ACQUAINTANCE WHO IS MORE LIKELY THAN YOU TO KNOW THE TARGET PERSON. You may send the folder to a friend, relative or acquaintance, but it must be someone you know on a first name basis.
  • 107. SIX DEGREES 1967: Stanley Milgram Network Science: Random Graphs
  • 108. SIX DEGREES 1991: John Guare Network Science: Random Graphs "Everybody on this planet is separated by only six other people. Six degrees of separation. Between us and everybody else on this planet. The president of the United States. A gondolier in Venice…. It's not just the big names. It's anyone. A native in a rain forest. A Tierra del Fuegan. An Eskimo. I am bound to everyone on this planet by a trail of six people. It's a profound thought. How every person is a new door, opening up into other worlds."
  • 109. WWW: 19 DEGREES OF SEPARATION Image by Matthew Hurst Blogosphere Network Science: Random Graphs
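  • 110. DISTANCES IN RANDOM GRAPHS Random graphs tend to have a tree-like topology with almost constant node degrees. $N = 1 + \langle k \rangle + \langle k \rangle^2 + \dots + \langle k \rangle^{d_{max}} = \frac{\langle k \rangle^{d_{max}+1} - 1}{\langle k \rangle - 1} \approx \langle k \rangle^{d_{max}}$, hence $d_{max} = \frac{\log N}{\log \langle k \rangle}$.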
  • 111. DISTANCES IN RANDOM GRAPHS $d_{max} = \frac{\log N}{\log \langle k \rangle}$, $\langle d \rangle = \frac{\log N}{\log \langle k \rangle}$. We call the small-world phenomenon the property that the average path length or the diameter depends logarithmically on the system size. Hence "small" means that ⟨d⟩ is proportional to log N, rather than N. In most networks this offers a better approximation to the average distance between two randomly chosen nodes, ⟨d⟩, than to $d_{max}$. The 1/log⟨k⟩ term implies that the denser the network, the smaller the distance between the nodes.
  • 112. Network Science: Graph Theory Average Degree
  • 113. Given the huge differences in scope, size, and average degree, the agreement is excellent. DISTANCES IN RANDOM GRAPHS compare with real data
  • 114. Why are small worlds surprising? Surprising compared to what?
  • 115. Three, Four or Six Degrees? For the globe's social networks: $\langle k \rangle \simeq 10^3$ and $N \simeq 7 \times 10^9$ for the world's population, so $\langle d \rangle = \frac{\ln N}{\ln \langle k \rangle} = 3.28$.
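The estimate above is a one-line calculation. A small sketch, assuming the random-network formula ⟨d⟩ = ln N / ln⟨k⟩ from the preceding slides (the function name `average_distance` is a hypothetical helper):

```python
import math

def average_distance(n, mean_degree):
    """Random-network estimate of the average distance: ln N / ln <k>."""
    return math.log(n) / math.log(mean_degree)

# World population with roughly a thousand acquaintances per person.
print(round(average_distance(7e9, 1e3), 2))  # 3.28
```

Because the dependence on N is logarithmic, doubling the population would add only about 0.1 to the estimate.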
  • 116. Image by Matthew Hurst Blogosphere
  • 118. CLUSTERING COEFFICIENT Since edges are independent and have the same probability p, $\langle L_i \rangle \cong p \frac{k_i(k_i-1)}{2}$, so $C_i \equiv \frac{2 \langle L_i \rangle}{k_i(k_i-1)} = p \approx \frac{\langle k \rangle}{N}$. •The clustering coefficient of random graphs is small. •For fixed degree, C decreases with the system size N. •C is independent of a node's degree k.
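The claim that the clustering coefficient of a random graph equals the link probability p can be checked by simulation. The sketch below uses only the standard library; the graph size, the value of p, and all function names are illustrative assumptions.

```python
import random
from itertools import combinations

def random_graph(n, p, rng):
    """G(N, p): link each node pair independently with probability p."""
    adj = {i: set() for i in range(n)}
    for i, j in combinations(range(n), 2):
        if rng.random() < p:
            adj[i].add(j)
            adj[j].add(i)
    return adj

def local_clustering(adj, i):
    """C_i = 2 L_i / (k_i (k_i - 1)); undefined (None) for k_i < 2."""
    k = len(adj[i])
    if k < 2:
        return None
    links = sum(1 for u, v in combinations(adj[i], 2) if v in adj[u])
    return 2 * links / (k * (k - 1))

def average_clustering(adj):
    """Average C_i over all nodes for which it is defined."""
    values = [c for c in (local_clustering(adj, i) for i in adj) if c is not None]
    return sum(values) / len(values)

graph = random_graph(300, 0.05, random.Random(0))
print(average_clustering(graph))  # expected to be close to p = 0.05
```

Increasing N at fixed ⟨k⟩ lowers p, and with it the measured clustering, which is the N-dependence stated on the slide.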
  • 119. C decreases with the system size N. C is independent of a node’s degree k. Network Science: Random Graphs CLUSTERING COEFFICIENT
  • 120. Image by Matthew Hurst Blogosphere Watts-Strogatz Model
  • 121. Real networks are not random Section 10
  • 122. ARE REAL NETWORKS LIKE RANDOM GRAPHS? As quantitative data about real networks became available, we can compare their topology with the predictions of random graph theory. Note that once we have N and ⟨k⟩ for a random network, we can derive every measurable property from them. Indeed, we have: Average path length: $\langle l_{rand} \rangle \approx \frac{\log N}{\log \langle k \rangle}$. Clustering coefficient: $C_{rand} = p \approx \frac{\langle k \rangle}{N}$. Degree distribution: $P(k) = e^{-\langle k \rangle} \frac{\langle k \rangle^k}{k!}$.
  • 123. PATH LENGTHS IN REAL NETWORKS Prediction: $\langle d \rangle = \frac{\log N}{\log \langle k \rangle}$. Real networks have short distances like random graphs.
  • 124. CLUSTERING COEFFICIENT Prediction: $C_{rand}$ underestimates the clustering coefficient of real networks by orders of magnitude.
  • 125. THE DEGREE DISTRIBUTION Prediction: $P(k) = e^{-\langle k \rangle} \frac{\langle k \rangle^k}{k!}$. Data: $P(k) \approx k^{-\gamma}$.
  • 127. (B) Most important: we need to ask ourselves, are real networks random? The answer is simply: NO There is no network in nature that we know of that would be described by the random network model. IS THE RANDOM GRAPH MODEL RELEVANT TO REAL SYSTEMS? Network Science: Random Graphs
  • 128. IF IT IS WRONG AND IRRELEVANT, WHY DID WE DEVOTE A FULL CLASS TO IT? It is the reference model for the rest of the class. It will help us calculate many quantities that can then be compared to the real data, so that we understand to what degree a particular property is the result of some random process. We look for patterns that are shared by a large number of real networks, yet deviate from the predictions of the random network model. In order to identify these, we need to understand what a particular property would look like if it were driven entirely by random processes. While WRONG and IRRELEVANT, it will turn out to be extremely USEFUL!
  • 130. Nodes: WWW documents Links: URL links Over 3 billion documents ROBOT: collects all URL’s found in a document and follows them recursively WORLD WIDE WEB R. Albert, H. Jeong, A-L Barabasi, Nature, 401 130 (1999).
  • 131. Power laws and scale-free networks Section 2
  • 132. Nodes: WWW documents Links: URL links Over 3 billion documents ROBOT: collects all URL’s found in a document and follows them recursively R. Albert, H. Jeong, A-L Barabasi, Nature, 401 130 (1999). WORLD WIDE WEB Network Science: Scale-Free Property
  • 133. Discrete vs. Continuum formalism Network Science: Scale-Free Property Discrete Formalism As node degrees are always positive integers, the discrete formalism captures the probability that a node has exactly k links: Continuum Formalism In analytical calculations it is often convenient to assume that the degrees can take up any positive real value: INTERPRETATION:
  • 134. 80/20 RULE Vilfredo Federico Damaso Pareto (1848 – 1923), Italian economist, political scientist and philosopher, who made important contributions to our understanding of income distribution and to the analysis of individuals' choices. A number of fundamental principles are named after him, like Pareto efficiency, the Pareto distribution (another name for a power-law distribution), and the Pareto principle (or 80/20 law).
  • 136. The difference between a power law and an exponential distribution
  • 137. The difference between a power law and an exponential distribution Let us use the WWW to illustrate the properties of the high-k regime. The probability to have a node with k~100 is •About in a Poisson distribution •About if $p_k$ follows a power law. •Consequently, if the WWW were to be a random network, according to the Poisson prediction we would expect $10^{-18}$ nodes with degree k>100, or none. •For a power law degree distribution, we expect about k>100 degree nodes
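  • 139. Finite scale-free networks: the size of the biggest hub. All real networks are finite; let us explore the consequences. We have an expected maximum degree, $k_{max}$. Estimating $k_{max}$: $\int_{k_{max}}^{\infty} P(k)\,dk \approx \frac{1}{N}$. Why: the probability of having a node with degree larger than $k_{max}$ should not exceed the probability of having one node, i.e. a 1/N fraction of all nodes. $\int_{k_{max}}^{\infty} P(k)\,dk = (\gamma-1) k_{min}^{\gamma-1} \int_{k_{max}}^{\infty} k^{-\gamma}\,dk = \frac{(\gamma-1)}{(-\gamma+1)} k_{min}^{\gamma-1} \left[ k^{-\gamma+1} \right]_{k_{max}}^{\infty} = \frac{k_{min}^{\gamma-1}}{k_{max}^{\gamma-1}} \approx \frac{1}{N}$, which gives $k_{max} = k_{min} N^{\frac{1}{\gamma-1}}$.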
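The estimate of the biggest hub can be evaluated directly. This is a minimal sketch of the formula derived on the slide; the function names and the example values of N, k_min and γ are illustrative assumptions.

```python
def expected_max_degree(n, k_min, gamma):
    """k_max = k_min * N^(1 / (gamma - 1))."""
    return k_min * n ** (1.0 / (gamma - 1.0))

def tail_probability(k_max, k_min, gamma):
    """Integral of P(k) from k_max to infinity: (k_min / k_max)^(gamma - 1)."""
    return (k_min / k_max) ** (gamma - 1.0)

n = 10**6
for gamma in (2.0, 2.5, 3.0):
    k_max = expected_max_degree(n, 1, gamma)
    # The tail probability times N should be 1 by construction.
    print(gamma, round(k_max), tail_probability(k_max, 1, gamma) * n)
```

For γ = 3 and N = 10^6 the expected biggest hub has about 1,000 links, while for γ = 2 it is of the order of N itself.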
  • 140. Finite scale-free networks The size of the biggest hub: $k_{max} = k_{min} N^{\frac{1}{\gamma-1}}$
  • 141. Finite scale-free networks Expected maximum degree: $k_{max} = k_{min} N^{\frac{1}{\gamma-1}}$. •$k_{max}$ increases with the size of the network: the larger a system is, the larger its biggest hub. •For γ>2, $k_{max}$ increases slower than N: the largest hub will contain a decreasing fraction of links as N increases. •For γ=2, $k_{max} \sim N$: the size of the biggest hub is O(N). •For γ<2, $k_{max}$ increases faster than N (condensation phenomena): the largest hub will grab an increasing fraction of links. Anomaly!
  • 142. Finite scale-free networks The size of the largest hub: $k_{max} = k_{min} N^{\frac{1}{\gamma-1}}$
  • 143. The meaning of scale-free Section 4
  • 144. Definition: Networks with a power law tail in their degree distribution are called ‘scale-free networks’ Where does the name come from? Critical Phenomena and scale-invariance (a detour) Slides after Dante R. Chialvo Scale-free networks: Definition Network Science: Scale-Free Property
  • 145. Phase transitions in complex systems I: Magnetism T = 0.99 Tc T = 0.999 Tc ξ ξ T = Tc T = 1.5 Tc T = 2 Tc Network Science: Scale-Free Property
  • 146. At T = Tc: correlation length diverges Fluctuations emerge at all scales: scale-free behavior Scale-free behavior in space Network Science: Scale-Free Property
  • 147. • Correlation length diverges at the critical point: the whole system is correlated! • Scale invariance: there is no characteristic scale for the fluctuation (scale-free behavior). • Universality: exponents are independent of the system’s details. CRITICAL PHENOMENA Network Science: Scale-Free Property
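  • 148. Divergences in scale-free distributions $P(k) = C k^{-\gamma}$, $k \in [k_{min}, \infty)$. Normalization $\int_{k_{min}}^{\infty} P(k)\,dk = 1$ gives $C = \frac{1}{\int_{k_{min}}^{\infty} k^{-\gamma}\,dk} = (\gamma-1) k_{min}^{\gamma-1}$, so $P(k) = (\gamma-1) k_{min}^{\gamma-1} k^{-\gamma}$. The m-th moment is $\langle k^m \rangle = \int_{k_{min}}^{\infty} k^m P(k)\,dk = (\gamma-1) k_{min}^{\gamma-1} \int_{k_{min}}^{\infty} k^{m-\gamma}\,dk = \frac{(\gamma-1)}{(m-\gamma+1)} k_{min}^{\gamma-1} \left[ k^{m-\gamma+1} \right]_{k_{min}}^{\infty}$. If $m-\gamma+1<0$: $\langle k^m \rangle = -\frac{(\gamma-1)}{(m-\gamma+1)} k_{min}^m$. If $m-\gamma+1>0$, the integral diverges. For a fixed γ this means that all moments with m>γ-1 diverge.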
  • 149. DIVERGENCE OF THE HIGHER MOMENTS $\langle k^m \rangle = (\gamma-1) k_{min}^{\gamma-1} \int_{k_{min}}^{\infty} k^{m-\gamma}\,dk = \frac{(\gamma-1)}{(m-\gamma+1)} k_{min}^{\gamma-1} \left[ k^{m-\gamma+1} \right]_{k_{min}}^{\infty}$ For a fixed γ this means that all moments with m>γ-1 diverge. Many degree exponents are smaller than 3, so $\langle k^2 \rangle$ diverges in the $N \to \infty$ limit!
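The closed form for the moments is easy to tabulate. The sketch below assumes the continuum power law with lower cutoff k_min; the function name `power_law_moment` is hypothetical, and the boundary case m = γ - 1, where the integral diverges logarithmically, is treated as divergent.

```python
import math

def power_law_moment(m, gamma, k_min=1.0):
    """m-th moment of P(k) = (gamma-1) k_min^(gamma-1) k^(-gamma) on [k_min, inf).

    Finite only when m - gamma + 1 < 0; returns math.inf otherwise.
    """
    exponent = m - gamma + 1
    if exponent >= 0:
        return math.inf
    return -(gamma - 1) / exponent * k_min ** m

for gamma in (2.5, 3.5):
    print(gamma, power_law_moment(1, gamma), power_law_moment(2, gamma))
```

With γ = 2.5 the mean degree is finite but the second moment diverges; with γ = 3.5 both are finite.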
  • 150. The meaning of scale-free
  • 151. The meaning of scale-free
  • 153. (Faloutsos, Faloutsos and Faloutsos, 1999) Nodes: computers, routers Links: physical lines INTERNET BACKBONE Network Science: Scale-Free Property
  • 155. SCIENCE CITATION INDEX Nodes: papers Links: citations $P(k) \sim k^{-\gamma}$ ($\gamma = 3$) (S. Redner, 1998) 1736 PRL papers (1988) 578... 25 H.E. Stanley,...
  • 156. SCIENCE COAUTHORSHIP M: math NS: neuroscience Nodes: scientist (authors) Links: joint publication (Newman, 2000, Barabasi et al 2001) Network Science: Scale-Free Property
  • 157. ONLINE COMMUNITIES Nodes: online user Links: email contact Ebel, Mielsch, Bornholdt, PRE 2002. Kiel University log files, 112 days, N=59,912 nodes. Pussokram.com online community; 512 days, 25,000 users. Holme, Edling, Liljeros, 2002.
  • 158. ONLINE COMMUNITIES Twitter: Jake Hoffman, Yahoo. Facebook: Brian Karrer, Lars Backstrom, Cameron Marlow, 2011
  • 159. METABOLIC NETWORK Organisms from all three domains of life (Archaea, Bacteria, Eukaryotes) are scale-free! $P_{in}(k) \approx k^{-2.2}$, $P_{out}(k) \approx k^{-2.2}$ H. Jeong, B. Tombor, R. Albert, Z.N. Oltvai, and A.L. Barabasi, Nature, 407 651 (2000)
  • 160. ACTOR NETWORK Nodes: actors Links: cast jointly N = 212,250 actors, ⟨k⟩ = 28.78, $P(k) \sim k^{-\gamma}$, γ=2.3 Days of Thunder (1990) Far and Away (1992) Eyes Wide Shut (1999)
  • 161. Nodes: people (Females; Males) Links: sexual relationships Liljeros et al. Nature 2001 4781 Swedes; 18-74; 59% response rate. SWEDISH SE-WEB Network Science: Scale-Free Property
  • 163. Not all networks are scale-free •Networks appearing in material science, like the network describing the bonds between the atoms in crystalline or amorphous materials, where each node has exactly the same degree. •The neural network of the C.elegans worm. •The power grid, consisting of generators and switches connected by transmission lines
  • 165. DISTANCES IN RANDOM GRAPHS Random graphs tend to have a tree-like topology with almost constant node degrees. • nr. of first neighbors: $N_1 \cong \langle k \rangle$ • nr. of second neighbors: $N_2 \cong \langle k \rangle^2$ • nr. of neighbours at distance d: $N_d \cong \langle k \rangle^d$ • estimate maximum distance: $1 + \sum_{l=1}^{l_{max}} \langle k \rangle^l = N$, so $l_{max} = \frac{\log N}{\log \langle k \rangle}$
  • 166. Distances in scale-free networks SMALL WORLD BEHAVIOR IN SCALE-FREE NETWORKS $k_{max} = k_{min} N^{\frac{1}{\gamma-1}}$ γ = 2: The size of the biggest hub is of order O(N). Most nodes can be connected within two layers of it, thus the average path length will be independent of the system size. 2 < γ < 3: The average path length increases slower than logarithmically. In a random network all nodes have comparable degree, thus most paths will have comparable length. In a scale-free network the vast majority of the paths go through the few high-degree hubs, reducing the distances between nodes. γ = 3: Some key models produce γ=3, so the result is of particular importance for them. This was first derived by Bollobas and collaborators for the network diameter in the context of a dynamical model, but it holds for the average path length as well. γ > 3: The second moment of the distribution is finite, thus in many ways the network behaves as a random network. Hence the average path length follows the result that we derived for the random network model earlier. Regime labels in the figure: Ultra Small World, Small World. Cohen, Havlin, Phys. Rev. Lett. 90, 58701 (2003); Cohen, Havlin and ben-Avraham, in Handbook of Graphs and Networks, Eds. Bornholdt and Schuster (Wiley-VCH, NY, 2002) Chap. 4; Confirmed also by: Dorogovtsev et al (2002), Chung and Lu (2002); (Bollobas, Riordan, 2002; Bollobas, 1985; Newman, 2001)
  • 167. Why are small worlds surprising? Surprising compared to what?
  • 168. Distances in scale-free networks SMALL WORLD BEHAVIOR IN SCALE-FREE NETWORKS
  • 169. We are always close to the hubs "it's always easier to find someone who knows a famous or popular figure than some run-the-mill, insignificant person." (Frigyes Karinthy, 1929)
  • 170. The role of the degree exponent Section 7
  • 171. SUMMARY OF THE BEHAVIOR OF SCALE-FREE NETWORKS
  • 172. Distances in scale-free networks Graphicality: No large networks for γ<2 $k_{max} = k_{min} N^{\frac{1}{\gamma-1}}$ In scale-free networks: For γ<2: 1/(γ-1)>1, so $k_{max}$ grows faster than N.
  • 173. Why don't we see networks with exponents in the range of γ=4,5,6, etc? $K_{max} = K_{min} N^{\frac{1}{\gamma-1}}$ In order to document a scale-free network, we need 2-3 orders of magnitude of scaling. That is, $K_{max} \sim 10^3$. However, that puts a constraint on the system size we require to document it. For example, to measure an exponent γ=5 over even two orders of magnitude of scaling ($K_{max}/K_{min} \approx 10^2$), we need a system size of the order of $N = \left( \frac{K_{max}}{K_{min}} \right)^{\gamma-1} \approx 10^8$. Mobile Call Network: Onnela et al. PNAS 2007, $N = 4.6 \times 10^6$, γ=8.4.
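The system-size requirement follows from inverting the formula for the biggest hub. A short sketch, where `required_system_size` and the chosen scaling ratios are illustrative assumptions:

```python
def required_system_size(scaling_ratio, gamma):
    """N = (K_max / K_min)^(gamma - 1): nodes needed to observe a power law
    whose degrees span the given ratio K_max / K_min."""
    return scaling_ratio ** (gamma - 1)

for gamma in (2.5, 3, 5, 8.4):
    print(gamma, f"{required_system_size(100, gamma):.3g}")
```

Two orders of magnitude of scaling need only about a thousand nodes at γ = 2.5, but about 10^8 nodes at γ = 5, which is why large exponents are so hard to document.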
  • 175. 2,800 Y2H interactions 4,100 binary LC interactions (HPRD, MINT, BIND, DIP, MIPS) Rual et al. Nature 2005; Stelze et al. Cell 2005 HUMAN INTERACTION NETWORK Network Science: Scale-Free Property
  • 176. (linear scale) Network Science: Scale-Free Property
  • 177. HUMAN INTERACTION DATA BY RUAL ET AL. (linear scale) $P(k) \sim (k+k_0)^{-\gamma}$, $k_0 = 1.4$, γ=2.6.
  • 179. Generating networks with a pre-defined $p_k$ Section 8
  • 180. Configuration model (1) Degree sequence: Assign a degree to each node, represented as stubs or half-links. The degree sequence is either generated analytically from a preselected distribution (Box 4.5), or it is extracted from the adjacency matrix of a real network. We must start from an even number of stubs, otherwise we will be left with unpaired stubs. (2) Network assembly: Randomly select a stub pair and connect them. Then randomly choose another pair from the remaining stubs and connect them. This procedure is repeated until all stubs are paired up. Depending on the order in which the stubs were chosen, we obtain different networks. Some networks include cycles (2a), others self-edges (2b) or multi-edges (2c). Yet, the expected number of self- and multi-edges goes to zero in the $N \to \infty$ limit.
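The two steps of the configuration model can be sketched in a few lines. This is a minimal illustration, not the lecture's implementation: shuffling the stub list and pairing consecutive entries is used here as an equivalent way of drawing random stub pairs, and self-edges and multi-edges are kept rather than rejected.

```python
import random

def configuration_model(degrees, rng=None):
    """Return a list of links (u, v) realizing the given degree sequence."""
    if sum(degrees) % 2 != 0:
        raise ValueError("the degree sequence must contain an even number of stubs")
    rng = rng or random.Random()
    # (1) Degree sequence: node i contributes degrees[i] stubs.
    stubs = [node for node, k in enumerate(degrees) for _ in range(k)]
    # (2) Network assembly: a random shuffle followed by pairing neighbours
    # in the list gives a uniformly random pairing of the stubs.
    rng.shuffle(stubs)
    return [(stubs[i], stubs[i + 1]) for i in range(0, len(stubs), 2)]

links = configuration_model([3, 2, 2, 1], random.Random(42))
print(links)
```

Each run with a different seed yields a different network with the same degree sequence, mirroring the dependence on stub order described above.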
  • 183. Hidden parameter model Start with N isolated nodes and assign to each node a “hidden parameter” η , which can be randomly selected from a ρ(η) distribution. We next connect each node pair with probability For example, the figure shows the probability to connect nodes (1,3) and (3,4). After connecting the nodes, we end up with the networks shown in (b) or (c), representing two independent realizations generated by the same hidden parameter sequence (a). The expected number of links in the obtained network is
  • 185. Case Study: PPI Network
  • 186. Something to keep in mind
  • 190. Section 1 Hubs represent the most striking difference between a random and a scale-free network. Their emergence in many real systems raises several fundamental questions: • Why does the random network model of Erdős and Rényi fail to reproduce the hubs and the power laws observed in many real networks? • Why do systems as different as the WWW and the cell converge to a similar scale-free architecture?
  • 191. Growth and preferential attachment Section 2
  • 192. networks expand through the addition of new nodes Barabási & Albert, Science 286, 509 (1999) BA MODEL: Growth ER model: the number of nodes, N, is fixed (static models)
  • 193. New nodes prefer to connect to the more connected nodes Barabási & Albert, Science 286, 509 (1999) Network Science: Evolving Network Models BA MODEL: Preferential attachment ER model: links are added randomly to the network
  • 194. Section 2: Growth and Preferential Attachment Barabási & Albert, Science 286, 509 (1999) The random network model differs from real networks in two important characteristics: Growth: While the random network model assumes that the number of nodes is fixed (time invariant), real networks are the result of a growth process that continuously increases the number of nodes. Preferential Attachment: While nodes in random networks randomly choose their interaction partners, in real networks new nodes prefer to link to the more connected nodes.
  • 196. Origin of SF networks: Growth and preferential attachment Barabási & Albert, Science 286, 509 (1999) (1) Networks continuously expand by the addition of new nodes. WWW: addition of new documents. GROWTH: add a new node with m links. (2) New nodes prefer to link to highly connected nodes. WWW: linking to well-known sites. PREFERENTIAL ATTACHMENT: the probability that a node connects to a node with k links is proportional to k: $\Pi(k_i) = \frac{k_i}{\sum_j k_j}$. Result: $P(k) \sim k^{-3}$.
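Growth and preferential attachment can be combined in a short simulation. The sketch below rests on assumptions that the slide does not specify: the network starts from m + 1 fully connected nodes, and each new node picks m distinct targets. Sampling uniformly from the list of link endpoints selects a node with probability proportional to its degree.

```python
import random

def barabasi_albert(n, m, rng=None):
    """Grow a network to n nodes; each new node adds m links."""
    rng = rng or random.Random()
    # Assumed initial condition: m + 1 fully connected nodes.
    edges = [(i, j) for i in range(m + 1) for j in range(i)]
    # A node of degree k appears k times in this list.
    endpoints = [node for edge in edges for node in edge]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))  # preferential attachment
        for old in targets:
            edges.append((new, old))
            endpoints.extend((new, old))
    return edges

edges = barabasi_albert(2000, 3, random.Random(1))
degree = {}
for u, v in edges:
    degree[u] = degree.get(u, 0) + 1
    degree[v] = degree.get(v, 0) + 1
print(max(degree.values()), min(degree.values()))
```

The oldest nodes typically end up as the hubs, since they have had the most time to accumulate links.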
  • 199. The structure and function of complex networks Section 3
  • 200. Section 4 (a) A food web of predator-prey interactions between species in a freshwater lake [272]. Picture courtesy of Neo Martinez and Richard Williams. (b) The network of collaborations between scientists at a private research institution [171]. (c) A network of sexual contacts between individuals in the study by Potterat et al. [342].
  • 202. Section 4 Protein interaction map of yeast Hubs Avoiding hubs Celebrity Couples Hubs Dating Hubs The probability that nodes with degrees k and k′ link to each other:
  • 203. Section 4 Assortativity and Disassortativity
  • 204. Section 4 Assortativity and Disassortativity Politics is Never Neutral The network behind the US political blogosphere illustrates the presence of assortative mixing, as the term is used in sociology: nodes of similar characteristics tend to link to each other.
  • 206. Section 4 "Robust" comes from the Latin Quercus Robur, meaning oak, the symbol of strength and longevity in the ancient world. The tree in the figure stands near the Hungarian village Diosviszlo and is documented at www.dendromania.hu, a site that catalogs Hungary's oldest and largest trees. Image courtesy of Gyorgy Posfai.
  • 207. Section 4 Breakdown Thresholds Under Random Failures and Attacks The table shows the estimated $f_c$ for random node failures (second column) and attacks (fourth column) for ten reference networks. The third column (randomized network) offers $f_c$ for a network whose N and L coincide with the original network, but whose nodes are connected randomly to each other. For most networks $f_c$ for random failures exceeds $f_c$ for the corresponding randomized network, indicating that these networks display enhanced robustness. Three networks lack this property: the power grid, a consequence of the fact that its degree distribution is exponential, and the actor and the citation networks, which have a very high ⟨k⟩, diminishing the role of the high $\langle k^2 \rangle$.
  • 209. Section 4 Communities in Belgium Communities extracted from the call pattern of the consumers of the largest Belgian mobile phone company. The network has about two million mobile phone users. The nodes correspond to communities, the size of each node being proportional to the number of individuals in the corresponding community. The color of each community on a red–green scale represents the language spoken in the particular community, red for French and green for Dutch. Only communities of more than 100 individuals are shown. The community that connects the two main clusters consists of several smaller communities with less obvious language separation, capturing the culturally mixed Brussels, the country’s capital.
  • 210. Section 4 HIERARCHICAL CLUSTERING AGGLOMERATIVE PROCEDURES: THE RAVASZ ALGORITHM Step 1: Define the Similarity Matrix Step 2: Decide Group Similarity Step 3: Apply Hierarchical Clustering Step 4: Dendrogram DIVISIVE PROCEDURES: THE GIRVAN-NEWMAN ALGORITHM Step 1: Define Centrality Step 2: Hierarchical Clustering
  • 211. Section 4 Topological Overlap Matrix Here Θ(x) is the Heaviside step function, which is zero for x≤0 and one for x>0; J(i, j) is the number of common neighbors of nodes i and j, to which we add one (+1) if there is a direct link between i and j; $\min(k_i, k_j)$ is the smaller of the degrees $k_i$ and $k_j$.
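The formula itself did not survive on this slide, so the sketch below assumes the commonly used form of the topological overlap, $x_{ij} = J(i,j) / (\min(k_i, k_j) + 1 - \Theta(A_{ij}))$, built from the quantities defined above. The function name and the dictionary-of-sets graph representation are illustrative choices.

```python
def topological_overlap(adj, i, j):
    """Assumed form: J(i, j) / (min(k_i, k_j) + 1 - Theta(A_ij)).

    adj maps each node to the set of its neighbors.
    """
    linked = 1 if j in adj[i] else 0           # Theta(A_ij)
    common = len(adj[i] & adj[j]) + linked     # J(i, j)
    return common / (min(len(adj[i]), len(adj[j])) + 1 - linked)

triangle = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
print(topological_overlap(triangle, "a", "b"))  # 1.0
print(topological_overlap(path, "a", "c"))      # 0.5
```

Linked nodes that share all their neighbors reach the maximum overlap of 1, while unlinked nodes with no common neighbors have overlap 0.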
  • 213. Section 4 Centrality Measures Divisive algorithms require a centrality measure that is high for node pairs that belong to different communities and low for node pairs in the same community. Two frequently used measures can achieve this: (a) Link Betweenness: Link betweenness captures the role of each link in information transfer. Hence $x_{ij}$ is proportional to the number of shortest paths between all node pairs that run along the link (i,j). Consequently, inter-community links, like the central link in the figure with $x_{ij} = 0.57$, have large betweenness. (b) Random-Walk Betweenness: A pair of nodes m and n is chosen at random. A walker starts at m, following each adjacent link with equal probability until it reaches n. Random-walk betweenness $x_{ij}$ is the probability that the link i→j was crossed by the walker, averaged over all possible choices for the starting nodes m and n.
  • 214. Section 4 The Girvan-Newman Algorithm The partition’s modularity is obtained by summing over all nc communities
  • 216. Network Science an interactive textbook barabasi.com/NetworkScienceBook/ facebook.com/NetworkScienceBook