ICPSR - Complex Systems Models in the Social Sciences - Lecture 3 - Professor Daniel Martin Katz

Complex Systems Models
in the Social Sciences
(Lecture 3)
daniel martin katz
illinois institute of technology
chicago kent college of law
@computational danielmartinkatz.com computationallegalstudies.com

Back to Where We Ended
Our Last Class

Stanley Milgram’s
Other Experiment
Milgram was interested in the
structure of society
Including the social distance
between individuals
While the term “six degrees” is often
attributed to Milgram, it can be traced to ideas
from Hungarian author Frigyes Karinthy
What is the average distance
between two individuals in
society?

Stanley Milgram’s
Other Experiment
Map: Nebraska (NE) to Massachusetts (MA)

Six Degrees of Separation?
Map: Nebraska (NE) to Massachusetts (MA)
Target person worked in Boston as a stockbroker
296 senders from Boston and Omaha.
20% of senders reached target.
Average chain length = 6.5.
And So the term ...
“Six degrees of Separation”

Six Degrees
Six Degrees is a claim that “average path
length” between two individuals in society
is ~ 6
The idea of ‘Six Degrees’ was popularized
through plays/movies and the Kevin Bacon
game
http://oracleofbacon.org/

Visualization Source: Duncan J. Watts, Six Degrees
Six Degrees of Kevin Bacon

But What is Wrong
with Milgram’s Logic?
150^2 = 22,500
150^3 = 3,375,000
150^4 = 506,250,000
150^5 = 75,937,500,000
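The figures above are what you get if each person presumably knows 150 others and no two people ever share an acquaintance, so the number of people reached multiplies by 150 at every step. A minimal sketch of that arithmetic (the function name `naive_reach` is illustrative, not from the lecture):

```python
def naive_reach(friends_per_person: int, steps: int) -> int:
    """People reached after `steps` hops if no acquaintances ever overlap."""
    return friends_per_person ** steps


for steps in range(2, 6):
    print(steps, f"{naive_reach(150, steps):,}")
```

The no-overlap assumption is the weak point: if my friends' friends are also likely to be my friends, far fewer new people are reached at each step.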

The Strength of ‘Weak’ Ties
Does Milgram get
it right? (Mark Granovetter)
Visualization Source: Early Friendster – MIT Network
www.visualcomplexity.com
Strong and Weak Ties
(Clustered
v.
Spanning)
Clustering ----
My Friends’ Friends
are also likely to
be friends

So Was Milgram Correct?
Small Worlds (i.e. Six Degrees) was a theoretical
and an empirical Claim
The Theoretical Account Was Incorrect
The Empirical Claim was still intact
The question: how could real social networks
display both small worlds and clustering?
At the same time, the Strength of Weak Ties was
also a theoretical and empirical proposition

Watts and Strogatz (1998)
A few random links in an otherwise clustered
graph yield the types of small world
properties found by Milgram
“Randomness” is the key bridge between the small
world result and the clustering that is
commonly observed in real social networks

Watts and Strogatz (1998)
A small amount of random rewiring, or
something akin to weak ties, allows for
clustering and small worlds
Random Graph / Locally Clustered

Different Form of
Network Representation
1 mode
2 mode

Different Forms of
Network Representation
2 mode
Actors and Movies

Different Forms of
Network Representation
1 mode
Actor to Actor
Could be Binary (0,1)
Did they Co-Appear?

Different Forms of
Network Representation
1 mode
Actor to Actor
Could also be Weighted
(i.e. Edge Weights by Number of Co-Appearances)
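Moving from the 2 mode (actors and movies) representation to the 1 mode (actor to actor) representation can be sketched as a projection. The cast data below is invented for illustration, and `project_to_actors` is a hypothetical helper name:

```python
from collections import Counter
from itertools import combinations

# Hypothetical two-mode data: movie -> set of actors (invented for illustration).
casts = {
    "Movie 1": {"Ann", "Bob", "Cy"},
    "Movie 2": {"Ann", "Bob"},
    "Movie 3": {"Cy", "Dee"},
}


def project_to_actors(casts):
    """Weighted one-mode edges: (actor, actor) -> number of co-appearances."""
    weights = Counter()
    for actors in casts.values():
        # Sorting gives each pair one canonical (a, b) ordering.
        for a, b in combinations(sorted(actors), 2):
            weights[(a, b)] += 1
    return weights


weighted = project_to_actors(casts)
# Binary version: 1 if the pair ever co-appeared, regardless of how often.
binary = {pair: 1 for pair in weighted}

for pair, w in sorted(weighted.items()):
    print(pair, "weight:", w, "binary:", binary[pair])
```

Pairs that never share a movie simply do not appear as edges in either version.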

Features of Networks
Mesoscopic Community Structures
We will discuss these next week
Macroscopic Graph Level Properties
We will discuss these today
Microscopic Node Level Properties
We will discuss these Next week

Macroscopic Graph
Level Properties
Degree Distributions (Outdegree & Indegree)
Clustering Coefficients
Connected Components
Shortest Paths
Density

Shortest Paths
Shortest Paths
The shortest set of links
connecting two nodes
Also known as the geodesic path
In many graphs, there are multiple
shortest paths

Shortest Paths
Shortest Paths
A and C are connected by
2 shortest paths
A – E – B - C
A – E – D - C
Diameter: the largest geodesic distance
in the graph
The distance between A and C is
the maximum for the graph: 3
(each shortest path uses three links)
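Shortest paths and the diameter can be computed with breadth-first search, counting distance as the number of links on a path. The sketch below assumes an edge list that is merely consistent with the two paths named on this slide (the slide's figure is not reproduced here), and the function names are illustrative:

```python
from collections import deque

# Hypothetical edge list consistent with the paths A-E-B-C and A-E-D-C.
edges = [("A", "E"), ("E", "B"), ("E", "D"), ("B", "C"), ("D", "C")]


def build_adjacency(edges):
    """Undirected adjacency sets from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def bfs(adj, source):
    """Return (distance, number of shortest paths) from source to each reachable node."""
    dist = {source: 0}
    count = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                # First time v is seen: u lies on a shortest path to v.
                dist[v] = dist[u] + 1
                count[v] = count[u]
                queue.append(v)
            elif dist[v] == dist[u] + 1:
                # Another equally short route into v.
                count[v] += count[u]
    return dist, count


def diameter(adj):
    """Largest geodesic distance over all pairs (graph assumed connected)."""
    return max(max(bfs(adj, s)[0].values()) for s in adj)


adj = build_adjacency(edges)
dist, count = bfs(adj, "A")
print("distance A-C:", dist["C"])
print("shortest paths A-C:", count["C"])
print("diameter:", diameter(adj))
```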

Shortest Paths
In the Watts-Strogatz Model
Shortest Paths are reduced by
increasing levels of random rewiring
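A rough way to see this effect is to build a ring lattice, rewire a fraction of its links at random, and compare average path lengths. This is a simplified sketch in the spirit of the Watts-Strogatz procedure, not a faithful reproduction of the 1998 paper; the parameter values (200 nodes, 6 neighbors, rewiring probability 0.1) and all function names are assumptions chosen for illustration:

```python
import random
from collections import deque


def ring_lattice(n, k):
    """Each node is linked to its k nearest neighbors (k even) on a ring."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(1, k // 2 + 1):
            adj[i].add((i + j) % n)
            adj[(i + j) % n].add(i)
    return adj


def rewire(adj, k, p, rng):
    """With probability p, move the far end of each lattice link to a random node."""
    n = len(adj)
    for j in range(1, k // 2 + 1):
        for i in range(n):
            old = (i + j) % n
            if old in adj[i] and rng.random() < p:
                # Avoid self-loops and duplicate links.
                choices = [t for t in range(n) if t != i and t not in adj[i]]
                if choices:
                    new = rng.choice(choices)
                    adj[i].discard(old)
                    adj[old].discard(i)
                    adj[i].add(new)
                    adj[new].add(i)
    return adj


def average_path_length(adj):
    """Mean BFS distance over all ordered pairs that can reach each other."""
    total, pairs = 0, 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


rng = random.Random(1)
n, k = 200, 6
baseline = average_path_length(ring_lattice(n, k))
rewired = average_path_length(rewire(ring_lattice(n, k), k, 0.1, rng))
print("no rewiring:", round(baseline, 2))
print("p = 0.1:", round(rewired, 2))
```

Trying several values of `p` between 0 and 1 shows how quickly the average path length falls as rewiring increases.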

Clustering Coefficients
Measure of the tendency of nodes
in a graph to cluster
There is a graph level average for
clustering
Also a local version, which measures how
close a node's neighborhood is to a clique
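One common way to make this concrete: a node's local clustering is the fraction of pairs of its neighbors that are themselves linked, and the graph level value is the average over nodes. The sketch below follows that definition on a toy graph invented for illustration; treating nodes with fewer than two neighbors as having clustering 0 is a convention assumed here, not something stated in the lecture:

```python
from itertools import combinations


def local_clustering(adj, node):
    """Fraction of pairs of `node`'s neighbors that are linked to each other."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0  # assumed convention for nodes with fewer than two neighbors
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return links / (k * (k - 1) / 2)


def average_clustering(adj):
    """Graph level clustering: mean of the local values."""
    return sum(local_clustering(adj, v) for v in adj) / len(adj)


# Hypothetical toy graph: a triangle (1, 2, 3) plus node 4 attached only to 3.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}

for v in adj:
    print(v, round(local_clustering(adj, v), 3))
print("average:", round(average_clustering(adj), 3))
```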

Density
Density = Of the connections
that could exist between n nodes
directed graph: emax = n*(n-1)
(each of the n nodes can connect to (n-1) other nodes)
undirected graph: emax = n*(n-1)/2
(since edges are undirected, count each one only once)
What Fraction are Present?

Density
What fraction are present?
density = e / emax
For example, out of 12
possible connections,
this graph has 7,
giving it a density of
7/12 = 0.58
A “fully connected” graph has a density = 1
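The density calculation is short enough to write out directly. In this sketch the function names are illustrative; 12 possible connections corresponds to a directed graph with n = 4, since 4*(4-1) = 12:

```python
def max_edges(n, directed):
    """Number of connections that could exist between n nodes."""
    return n * (n - 1) if directed else n * (n - 1) // 2


def density(e, n, directed):
    """Fraction of possible connections that are present."""
    return e / max_edges(n, directed)


print(max_edges(4, directed=True))              # possible arcs among 4 nodes
print(round(density(7, 4, directed=True), 2))   # 7 arcs present
print(density(6, 4, directed=False))            # all 6 undirected edges present
```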

We are often interested in whether
the graph has a single or multiple
connected components
Strong Components
Giant Component
Weak Components
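For a directed graph, the distinction can be sketched as follows: a weak component ignores arc direction, while a strong component requires that every node can reach every other by following arcs. The example graph and function names below are invented for illustration, and the "giant" component is simply taken to be the largest one:

```python
def reachable(adj, start):
    """Set of nodes reachable from `start` by following links in `adj`."""
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in adj.get(u, ()):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def components(nodes, arcs, strong):
    """List of weak (strong=False) or strong (strong=True) components."""
    out = {v: set() for v in nodes}
    inn = {v: set() for v in nodes}
    for u, v in arcs:
        out[u].add(v)
        inn[v].add(u)
    both = {v: out[v] | inn[v] for v in nodes}  # direction ignored
    found, assigned = [], set()
    for v in nodes:
        if v in assigned:
            continue
        if strong:
            # Nodes v can reach that can also reach v.
            comp = reachable(out, v) & reachable(inn, v)
        else:
            comp = reachable(both, v)
        found.append(comp)
        assigned |= comp
    return found


# Hypothetical graph: a cycle a->b->c->a, an arc c->d, and an isolated node e.
nodes = ["a", "b", "c", "d", "e"]
arcs = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]

weak = components(nodes, arcs, strong=False)
strong = components(nodes, arcs, strong=True)
giant = max(weak, key=len)
print("weak:", [sorted(c) for c in weak])
print("strong:", [sorted(c) for c in strong])
print("largest weak component:", sorted(giant))
```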

“Largest Weakly Connected Component” in the
SCOTUS Citation Network
There exist cases that are not in this visual as
they are disconnected as of the year 1830
However, by 2009, 99% of SCOTUS Decisions are
in the Largest Weakly Connected Component

Open “Giant Component” from
the NetLogo Models Library

Model has been advanced 25+ Ticks
Notice the fraction of nodes in the giant component
Notice the Size of the “Giant Component”

Model has been advanced 80+ Ticks
Notice the fraction of nodes in the giant component
Notice the Size of the “Giant Component”

Model has been advanced 120+ Ticks
Notice the fraction of nodes in the giant component
Notice the Size of the “Giant Component”
now = “num-nodes” in the slider
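For readers without NetLogo at hand, the same qualitative behavior can be sketched in a few lines: add one random link per tick and track what share of nodes sits in the largest component. This is only loosely modeled on the library model, not a port of it; the node count, tick count, and function name are assumptions, and unlike a careful implementation this sketch may occasionally draw a pair that is already linked:

```python
import random


def giant_fraction_over_time(num_nodes, num_ticks, seed=0):
    """Add one random link per tick; return the largest component's share after each tick."""
    rng = random.Random(seed)
    parent = list(range(num_nodes))  # union-find forest
    size = [1] * num_nodes

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    largest, fractions = 1, []
    for _ in range(num_ticks):
        a, b = rng.sample(range(num_nodes), 2)
        ra, rb = find(a), find(b)
        if ra != rb:
            # Merge the smaller component into the larger one.
            if size[ra] < size[rb]:
                ra, rb = rb, ra
            parent[rb] = ra
            size[ra] += size[rb]
            largest = max(largest, size[ra])
        fractions.append(largest / num_nodes)
    return fractions


fractions = giant_fraction_over_time(num_nodes=200, num_ticks=400, seed=1)
for tick in (50, 100, 200, 400):
    print(tick, round(fractions[tick - 1], 2))
```

The share stays small for a while and then climbs steeply once there is roughly one link for every two nodes.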

Degree Distributions
outdegree 
how many directed edges (arcs)
originate at a node
indegree 
how many directed edges (arcs) are
incident on a node
degree (in or out) 
number of edges incident on a node
Indegree=3
Outdegree=2
Degree=5

Node Degree
from
Matrix Values
Outdegree:
outdegree for node 3 = 2,
which we obtain by summing
the number of non-zero
entries in the 3rd row
Indegree:
indegree for node 3 = 1,
which we obtain by summing
the number of non-zero
entries in the 3rd column
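The row and column counts can be sketched directly. The adjacency matrix below is hypothetical (the slide's matrix is not reproduced here) but is chosen so that node 3 has outdegree 2 and indegree 1, matching the values above; the function names are illustrative:

```python
# Hypothetical adjacency matrix: A[i][j] != 0 means an arc from node i+1 to node j+1.
A = [
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 1],
    [0, 0, 0, 0],
]


def outdegree(A, node):
    """Count non-zero entries in the node's row (nodes numbered from 1)."""
    return sum(1 for x in A[node - 1] if x != 0)


def indegree(A, node):
    """Count non-zero entries in the node's column (nodes numbered from 1)."""
    return sum(1 for row in A if row[node - 1] != 0)


print("outdegree of node 3:", outdegree(A, 3))
print("indegree of node 3:", indegree(A, 3))
```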

These are degree counts for particular nodes
but we are also interested in the distribution
of arcs (or edges) across all nodes
These Distributions are called “degree
distributions”
Degree distribution: A frequency count of
the occurrence of each degree

Imagine we have this 8 node network:
In-degree sequence:
[2, 2, 2, 1, 1, 1, 1, 0]
Out-degree sequence:
[2, 2, 2, 2, 1, 1, 1, 0]
(undirected) degree sequence:
[3, 3, 3, 2, 2, 1, 1, 1]

Imagine we have this 8 node network:
In-degree distribution:
[(2,3) (1,4) (0,1)]
Out-degree distribution:
[(2,4) (1,3) (0,1)]
(undirected) distribution:
[(3,3) (2,2) (1,3)]
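Going from a degree sequence to a degree distribution is a frequency count. A minimal sketch using the sequences from the previous slide (`degree_distribution` is an illustrative name):

```python
from collections import Counter


def degree_distribution(sequence):
    """Return (degree, count) pairs, highest degree first."""
    counts = Counter(sequence)
    return sorted(counts.items(), reverse=True)


in_seq = [2, 2, 2, 1, 1, 1, 1, 0]
out_seq = [2, 2, 2, 2, 1, 1, 1, 0]
deg_seq = [3, 3, 3, 2, 2, 1, 1, 1]

print("in: ", degree_distribution(in_seq))
print("out:", degree_distribution(out_seq))
print("und:", degree_distribution(deg_seq))
```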

Why are Degree
Distributions Useful?
They are the signature of a dynamic process
We will discuss in greater detail tomorrow
Consider several canonical network models

Canonical Network Models
Erdős-Rényi
Random Network
Highly Clustered
Network
Watts-Strogatz
Small World Network
Barabási-Albert
Preferential
Attachment Network

Why are Degree
Distributions Useful?
Barabási-Albert
Preferential
Attachment Network

Power Law /
Scale Free
Networks

Barabási-Albert
Preferential Attachment
NetLogo Models Library --> Networks --> Preferential Attachment
Watch the Changing
Degree Distribution

Barabási-Albert
Preferential Attachment
NetLogo Models Library --> Networks --> Preferential Attachment

Readings on Power law /
Scale free Networks
Check out Lada Adamic’s Power Law Tutorial
Describes distinctions between the Zipf,
Power-law and Pareto distribution
http://www.hpl.hp.com/research/idl/papers/ranking/ranking.html
This is the original paper that gave rise to
all of the other power law networks papers:
A.-L. Barabási & R. Albert, Emergence of scaling in random
networks, Science 286, 509–512 (1999)

Power Laws Seem
to be Everywhere

How Do I Know Something
is Actually a Power Law?

Clauset, Shalizi & Newman
http://arxiv.org/abs/0706.1062
argues for the use of MLE
instead of linear regression
Demonstrates that a number
of prior papers mistakenly
called their distribution a
power law
Here is why you should use
Maximum Likelihood Estimation
(MLE) instead of linear
regression
You recover the power law
when it's present
Notice the spread between the
yellow and red lines
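To make the MLE recommendation concrete, here is a minimal sketch of the continuous-case estimator from Clauset, Shalizi & Newman, alpha = 1 + n / sum(ln(x_i / xmin)), applied to synthetic data. It assumes xmin is known; the paper also covers estimating xmin, the discrete case, and goodness-of-fit testing, none of which is shown here. The function names and the chosen values (alpha = 2.5, xmin = 1) are illustrative:

```python
import math
import random


def sample_power_law(n, alpha, xmin, rng):
    """Inverse-transform samples from a continuous power law p(x) ~ x^(-alpha), x >= xmin."""
    return [xmin * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


def mle_alpha(xs, xmin):
    """Continuous-case maximum likelihood estimate of the exponent, xmin assumed known."""
    tail = [x for x in xs if x >= xmin]
    return 1.0 + len(tail) / sum(math.log(x / xmin) for x in tail)


rng = random.Random(42)
true_alpha, xmin = 2.5, 1.0
xs = sample_power_law(20000, true_alpha, xmin, rng)
estimate = mle_alpha(xs, xmin)
print("true alpha:", true_alpha, "MLE estimate:", round(estimate, 3))
```

Fitting a straight line to a log-log histogram of the same samples is a useful exercise for comparison, since that is the approach the paper argues against.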

Back to the Random Graph
Models for a Moment
Poisson distribution
Erdős-Rényi is the default random
graph model:
randomly draw E edges
between N nodes
There are no hubs in the network
Rather, there exists a narrow
distribution of connectivities

Back to the Random Graph
Models for a Moment
let there be n people
p is the probability that any two of them are ‘friends’
Binomial --> (limit: p small) --> Poisson --> (limit: large n) --> Normal
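The narrow, hub-free degree distribution can be checked by simulation. The sketch below links each pair of n people independently with probability p and compares the observed degree frequencies with a Poisson distribution whose mean is (n-1)*p. The values n = 1000 and p = 0.005 and the function names are assumptions chosen for illustration:

```python
import math
import random
from collections import Counter


def gnp_degrees(n, p, rng):
    """Degrees in a random graph where each pair is linked independently with probability p."""
    degree = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                degree[i] += 1
                degree[j] += 1
    return degree


def poisson_pmf(k, lam):
    """Poisson probability of exactly k events with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


rng = random.Random(0)
n, p = 1000, 0.005
degrees = gnp_degrees(n, p, rng)
lam = (n - 1) * p  # expected degree
hist = Counter(degrees)

print("degree  observed  poisson")
for k in range(13):
    print(k, round(hist.get(k, 0) / n, 3), round(poisson_pmf(k, lam), 3))
print("largest degree:", max(degrees))
```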

Random
Graphs
Power Law
networks

Generating Power Law
Distributed Networks
Pseudocode for the growing power law networks:
Start with small number of nodes
add new vertices one by one
each new edge connects to an existing vertex in
proportion to the number of edges that vertex
already displays (i.e. preferentially attach)
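The pseudocode above can be sketched in Python as follows. This is one simple reading of it, in which every new vertex brings exactly one edge and the process starts from two linked nodes; both choices are assumptions, and the function names are illustrative:

```python
import random


def preferential_attachment(num_nodes, seed=0):
    """Grow a network one node at a time; each newcomer adds one edge to an
    existing node chosen with probability proportional to its degree."""
    rng = random.Random(seed)
    edges = [(0, 1)]  # assumed starting point: two linked nodes
    # Every node appears in `endpoints` once per edge it touches, so a uniform
    # draw from this list is a degree-proportional draw over nodes.
    endpoints = [0, 1]
    for new in range(2, num_nodes):
        target = rng.choice(endpoints)
        edges.append((new, target))
        endpoints.extend([new, target])
    return edges


def degrees(edges):
    """Degree of each node from an undirected edge list."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg


edges = preferential_attachment(2000, seed=1)
deg = degrees(edges)
top = sorted(deg.items(), key=lambda item: item[1], reverse=True)[:5]
print("five best connected nodes (node, degree):", top)
print("nodes with degree 1:", sum(1 for d in deg.values() if d == 1))
```

Most nodes end up with a single link while a few early nodes accumulate many, which is the hub pattern the random graph model lacks.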

Growing Power Law
Distributed Networks
The previous pseudocode is not a unique solution
A variety of other growth dynamics are possible
In the simple case this is a system that is extremely
“sensitive to initial conditions”
upstarts who garner early advantage are able to
extend their relative advantage in later periods
for example, imagine you receive a higher interest
rate the more money you have (the “rich get richer”)
