SlideShare a Scribd company logo
Networks
           Part II

            Sharad Goel
        Columbia University
Computational Social Science: Lecture 6

            March 1, 2013
Corporate E-mail Communication
[ Adamic & Adar, 2004 ]
via Easley & Kleinberg
Networks/Graphs

             Nodes/vertices
people, organizations, webpages, computers

                  Edges
represent connections between pairs of nodes
Computational Social Science, Lecture 06: Networks, Part II
Distance
Length of the shortest path between two nodes
Distance
Length of the shortest path between two nodes
Breadth-first Search
iteratively explore nodes one layer at a time
# initialize distances
dist = {}
for u in G:
   dist[u] = NA

dist [u0] = 0

d=0
periphery = { u0 }
while len(periphery) > 0:
  # find nodes one step away from the periphery
  next_level = {}
  for u in periphery:
     next_level += { w for w in neighbors[u] if dist[w] == NA }

   # update distances
   d += 1
   for u in next_level:
     dist[u] = d

  # update periphery
  periphery = next_level
BFS @ scale
    undirected network

           Input
 edge list, starting node u0

          Output
Distance to all nodes from u0
BFS @ scale
        undirected network

Input: edge list, distances (u, d)
1. join distances with edge list
2. foreach (u, d, w) output (w, d+1)
   [ also output (u0, 0) ]
3. group by w, and output min d
Connected Components
     undirected network

             Input
            Edge list

            Output
List of nodes for each component
Connected Graph
There is a path between every pair of nodes
Connected Graph
There is a path between every pair of nodes
Connected Component
 A connected subset of nodes that is not
contained in any larger connected subset
Connected Components
             undirected network

1. Select a node u0 that has not yet been assigned
2. BFS starting from u0
3. Record nodes reached by BFS
Consider the global human social network,
with an edge between every pair of friends

       Is this network connected?
Consider the global human social network,
   with an edge between every pair of friends

           Is this network connected?

No – there are people with no (living) friends, who
 are hence isolated from the rest of the network
Consider the global human social network,
with an edge between every pair of friends

Is there a “giant” connected component?
Consider the global human social network,
with an edge between every pair of friends

Is there more than one “giant” component?
Consider the global human social network,
with an edge between every pair of friends

Is there more than one “giant” component?

        No – unlikely to have two
    large disconnected sets of people
Consider the global human social network,
with an edge between every pair of friends

Is there more than one “giant” component?

        No – unlikely to have two
    large disconnected sets of people

        Historically it was more likely
  e.g., pre-Columbian America & Eurasia
Consider the global human social network,
   with an edge between every pair of friends

On average, how far are people from one another?
The Small-world Experiment
            Stanley Milgram, 1967

296 people were randomly selected in Omaha and Wichita

Packages sent to the selected individuals with instructions to
forward to a particular stock broker in Boston through a chain of
people they knew on a first-name basis.
The Small-world Experiment
     Stanley Milgram, 1967

  Of the 296 packages, 232 did not reach target

Of the 64 that did arrive, average path length was 6

            “Six degrees of separation”
Small-world phenomenon

Is “six degrees” big or small?
Small-world phenomenon

navigational vs. topological
The Anatomy of the Facebook Social Graph
J. Ugander, B. Karrer, L. Backstrom, C. Marlow

     721 million users, 69 billion edges
         5 degrees of separation
Edge list  degree distribution
       undirected network

             Input
            Edge list

            Output
       Degree distribution
3
1       2


                5

    4


                              7
            6



                Degree of node u
                # of edges incident on u
Edge list  degree distribution
       undirected network

              Map
         input: (u, w)
     output: (u, w), key := u
     output: (w, u), key := w

            Reduce
      input: u, {w1, …, wk}
          output: u, k
Edge list  degree distribution
       undirected network

              Map
           input: u, k
        identity, key := k

            Reduce
       input: k, {u1, …, um}
          output: k, m
An email network of 130M users
Edges indicate reciprocated communication
An email network of 130M users
Edges indicate reciprocated communication
                (log-log plot)
Clustering
Clustering
Triadic closure

1. Opportunity
2. Incentive
3. Commonality
Counting Triangles
          undirected network

                 Input
              adjacency list

                Output
Number of triangles incident on each node
Counting Triangles
                     In memory


for u in nodes:
   triangles[u] = 0
   for w in neighbors[u]:
      triangles[u] += len(neighbors[w] & neighbors[u])

triangles[u] = triangles[u] / 2
Counting Triangles
                       @ scale


Every node needs to know to which nodes it is connected
                         and
      to which nodes its neighbors are connected
Counting Triangles
        @ scale

         Map
  input: u {w1, …, wk}
  foreach wi:
            output wi u {w1, …, wk}

        Reduce
In memory triangle count
Homophily
the tendency of individuals to associate with similar others


              “birds of a feather flock together”
Birds of a Feather: Homophily in Social Networks
          McPherson, Smith-Lovin, Cook

            race, sex, age, religion, education,
            occupation, social class, behaviors,
              attitudes, abilities, aspirations
Homophily

1. Preference
2. Influence
3. Opportunity
Fantasy Football
Computing Homophily

                  Input
    Edge list, race of each individual

                  Output
   Distribution of race among friends

          White    Black   Latino   Asian
White
Black
Latino
Asian
Computing Homophily

1. Join edges (u, v) by u, demographics (w, race) by w
2. Join edges (u, v, urace) by v, demographics (w, race) by w
Computing Homophily

1. Join edges (u, v) by u, demographics (w, race) by w
2. Join edges (u, v, urace) by v, demographics (w, race) by w
3. Group edges (u, v, urace, vrace) by sorted([urace, vrace])
Computing Homophily

1.   Join edges (u, v) by u, demographics (w, race) by w
2.   Join edges (u, v, urace) by v, demographics (w, race) by w
3.   Group edges (u, v, urace, vrace) by sorted([urace, vrace])
4.   Count edges in each group
5.   Normalize the table
How do ideas and products
 spread through society?
The structure of diffusion


93%   5%        1%          0.3%    0.3%
Computing the structure of diffusion

                Input
          Twitter network
 Time-stamped “adoptions” of 1B URLs

               Output
   Distribution of cascade structures
Computing the structure of diffusion

 We assume v influenced u to adopt link t if
v is the last of u’s contacts to adopt before u

                     2
                             3



                     5




                     9
Computing the structure of diffusion

   Draw a labeled edge from v to u




                          3

                      t
                5
Computing the structure of diffusion

   Group edges by their labels (URL)
Computing the structure of diffusion

Compute the connected components for each
      forest corresponding to a URL
Computing the structure of diffusion

Definition. Two (rooted) trees are isomorphic if they are
identical under a relabeling of the vertices.
x          (x)            (x, x)             ((x))         (x, (x))




Basis. The canonical name c(T) for the one-node tree T is x.

Induction. If T has more than one node, let T1, . . . ,Tk denote the
subtrees of the root indexed such that c(T1) ≤ c(T2) ≤ · · · ≤ c(Tk)
under the lexicographic order. Then the canonical name for T is
(c(T1), . . . ,c(Tk)).
                           Aho et al. [1974]
Computing the structure of diffusion

 Compute the canonical name for each
       tree in the URL forests
Computing the structure of diffusion

Count the number of trees of each type
Computing the structure of diffusion

 We assume v influenced u to adopt link t if
v is the last of u’s contacts to adopt before u

                     2
                             3



                     5




                     9
Computing the structure of diffusion

   Draw a labeled edge from v to u




                          3

                      t
                5
Computing the structure of diffusion


1. Join adoptions (link, u, uts) by u, edges (u, w) by u
2. Join (link, u, uts, w) by (link, w),
   adoptions (link, w, wts) by (link, w)
Computing the structure of diffusion


1. Join adoptions (link, u, uts) by u, edges (u, w) by u
2. Join (link, u, uts, w) by
   (link, w),                             adoptions (link, w, wts) by
   (link, w)
3. Group (link, u, uts, w, wts) by (link, u)
Computing the structure of diffusion


1. Join adoptions (link, u, uts) by u, edges (u, w) by u
2. Join (link, u, uts, w) by (link, w),
   adoptions (link, w, wts) by (link, w)
3. Group (link, u, uts, w, wts) by (link, u)
4. Output unique “parent” edge (link, u, uts, w, wts) for each group

More Related Content

KEY
P2P Supernodes
PDF
Computational Social Science, Lecture 10: Online Experiments
PDF
Computational Social Science, Lecture 08: Counting Fast, Part II
PDF
Computational Social Science, Lecture 07: Counting Fast, Part I
PDF
Computational Social Science, Lecture 11: Regression
PDF
Computational Social Science, Lecture 13: Classification
PDF
Computational Social Science, Lecture 09: Data Wrangling
PPTX
Computational Social Science, Lecture 05: Networks, Part I
P2P Supernodes
Computational Social Science, Lecture 10: Online Experiments
Computational Social Science, Lecture 08: Counting Fast, Part II
Computational Social Science, Lecture 07: Counting Fast, Part I
Computational Social Science, Lecture 11: Regression
Computational Social Science, Lecture 13: Classification
Computational Social Science, Lecture 09: Data Wrangling
Computational Social Science, Lecture 05: Networks, Part I

Viewers also liked (20)

PPTX
Computational Social Science, Lecture 03: Counting at Scale, Part I
PPTX
Computational Social Science, Lecture 04: Counting at Scale, Part II
PDF
Computational Social Science, Lecture 02: An Introduction to Counting
PDF
Modeling Social Data, Lecture 6: Regression, Part 1
PDF
Modeling Social Data, Lecture 2: Introduction to Counting
PDF
Modeling Social Data, Lecture 1: Overview
PPTX
Influential figures in the thriller genre
PPTX
Presentación para los niños de primaria
PPT
Auto Elektrikoen Aurkezpena
DOCX
14.virus lentos no convencionales priones
PDF
лабар6
PDF
лабар9
PDF
Certificate and transcript
ODP
Nuestra salud maria g
PDF
практ1
PDF
RUSA 2016 Final Proof Jan. 21 [146409]
DOCX
Proyecto i.a.v.a
PPTX
Genre research
DOCX
Question 1
PPT
банери
Computational Social Science, Lecture 03: Counting at Scale, Part I
Computational Social Science, Lecture 04: Counting at Scale, Part II
Computational Social Science, Lecture 02: An Introduction to Counting
Modeling Social Data, Lecture 6: Regression, Part 1
Modeling Social Data, Lecture 2: Introduction to Counting
Modeling Social Data, Lecture 1: Overview
Influential figures in the thriller genre
Presentación para los niños de primaria
Auto Elektrikoen Aurkezpena
14.virus lentos no convencionales priones
лабар6
лабар9
Certificate and transcript
Nuestra salud maria g
практ1
RUSA 2016 Final Proof Jan. 21 [146409]
Proyecto i.a.v.a
Genre research
Question 1
банери
Ad

Similar to Computational Social Science, Lecture 06: Networks, Part II (20)

PDF
Document 8 1.pdf
PDF
Community structure in social and biological structures
PDF
Undirected graphs
PPTX
GRAPH THEORY OF NUMBER THEOREM IN DISCRETE MATH
PDF
Exploratory social network analysis with pajek
PDF
Skiena algorithm 2007 lecture10 graph data strctures
PPTX
2013 NodeXL Social Media Network Analysis
PPTX
20121010 marc smith - mapping collections of connections in social media with...
PPT
Graphs in c language
PPTX
20121001 pawcon 2012-marc smith - mapping collections of connections in socia...
PPTX
Slides Chapter10.1 10.2
PPTX
Social Network Analysis - Lecture 4 in Introduction to Computational Social S...
PPT
Measurement and modeling of the web and related data sets
PDF
Community Detection in Social Media
PDF
Jürgens diata12-communities
PDF
Map Coloring and Some of Its Applications
PDF
4.1 network analysis basic
PPT
358 33 powerpoint-slides_13-graphs_chapter-13
PDF
gsm nithya.pdf
Document 8 1.pdf
Community structure in social and biological structures
Undirected graphs
GRAPH THEORY OF NUMBER THEOREM IN DISCRETE MATH
Exploratory social network analysis with pajek
Skiena algorithm 2007 lecture10 graph data strctures
2013 NodeXL Social Media Network Analysis
20121010 marc smith - mapping collections of connections in social media with...
Graphs in c language
20121001 pawcon 2012-marc smith - mapping collections of connections in socia...
Slides Chapter10.1 10.2
Social Network Analysis - Lecture 4 in Introduction to Computational Social S...
Measurement and modeling of the web and related data sets
Community Detection in Social Media
Jürgens diata12-communities
Map Coloring and Some of Its Applications
4.1 network analysis basic
358 33 powerpoint-slides_13-graphs_chapter-13
gsm nithya.pdf
Ad

More from jakehofman (17)

PPTX
Modeling Social Data, Lecture 12: Causality & Experiments, Part 2
PPTX
Modeling Social Data, Lecture 11: Causality and Experiments, Part 1
PDF
Modeling Social Data, Lecture 10: Networks
PDF
Modeling Social Data, Lecture 8: Classification
PDF
Modeling Social Data, Lecture 7: Model complexity and generalization
PDF
Modeling Social Data, Lecture 4: Counting at Scale
PDF
Modeling Social Data, Lecture 3: Data manipulation in R
PDF
Modeling Social Data, Lecture 8: Recommendation Systems
PDF
Modeling Social Data, Lecture 6: Classification with Naive Bayes
PDF
Modeling Social Data, Lecture 3: Counting at Scale
PDF
Modeling Social Data, Lecture 2: Introduction to Counting
PDF
Modeling Social Data, Lecture 1: Case Studies
PDF
NYC Data Science Meetup: Computational Social Science
PDF
Technical Tricks of Vowpal Wabbit
PDF
Data-driven modeling: Lecture 10
PDF
Data-driven modeling: Lecture 09
PDF
Using Data to Understand the Brain
Modeling Social Data, Lecture 12: Causality & Experiments, Part 2
Modeling Social Data, Lecture 11: Causality and Experiments, Part 1
Modeling Social Data, Lecture 10: Networks
Modeling Social Data, Lecture 8: Classification
Modeling Social Data, Lecture 7: Model complexity and generalization
Modeling Social Data, Lecture 4: Counting at Scale
Modeling Social Data, Lecture 3: Data manipulation in R
Modeling Social Data, Lecture 8: Recommendation Systems
Modeling Social Data, Lecture 6: Classification with Naive Bayes
Modeling Social Data, Lecture 3: Counting at Scale
Modeling Social Data, Lecture 2: Introduction to Counting
Modeling Social Data, Lecture 1: Case Studies
NYC Data Science Meetup: Computational Social Science
Technical Tricks of Vowpal Wabbit
Data-driven modeling: Lecture 10
Data-driven modeling: Lecture 09
Using Data to Understand the Brain

Recently uploaded (20)

PDF
VCE English Exam - Section C Student Revision Booklet
PPTX
Pharmacology of Heart Failure /Pharmacotherapy of CHF
PDF
Computing-Curriculum for Schools in Ghana
PPTX
Presentation on HIE in infants and its manifestations
PDF
Classroom Observation Tools for Teachers
PDF
01-Introduction-to-Information-Management.pdf
PPTX
Pharma ospi slides which help in ospi learning
PPTX
PPT- ENG7_QUARTER1_LESSON1_WEEK1. IMAGERY -DESCRIPTIONS pptx.pptx
PPTX
master seminar digital applications in india
PDF
FourierSeries-QuestionsWithAnswers(Part-A).pdf
PDF
STATICS OF THE RIGID BODIES Hibbelers.pdf
PPTX
GDM (1) (1).pptx small presentation for students
PDF
102 student loan defaulters named and shamed – Is someone you know on the list?
PPTX
human mycosis Human fungal infections are called human mycosis..pptx
PDF
Saundersa Comprehensive Review for the NCLEX-RN Examination.pdf
PPTX
IMMUNITY IMMUNITY refers to protection against infection, and the immune syst...
PDF
O7-L3 Supply Chain Operations - ICLT Program
PPTX
Cell Structure & Organelles in detailed.
PDF
Chinmaya Tiranga quiz Grand Finale.pdf
PPTX
Tissue processing ( HISTOPATHOLOGICAL TECHNIQUE
VCE English Exam - Section C Student Revision Booklet
Pharmacology of Heart Failure /Pharmacotherapy of CHF
Computing-Curriculum for Schools in Ghana
Presentation on HIE in infants and its manifestations
Classroom Observation Tools for Teachers
01-Introduction-to-Information-Management.pdf
Pharma ospi slides which help in ospi learning
PPT- ENG7_QUARTER1_LESSON1_WEEK1. IMAGERY -DESCRIPTIONS pptx.pptx
master seminar digital applications in india
FourierSeries-QuestionsWithAnswers(Part-A).pdf
STATICS OF THE RIGID BODIES Hibbelers.pdf
GDM (1) (1).pptx small presentation for students
102 student loan defaulters named and shamed – Is someone you know on the list?
human mycosis Human fungal infections are called human mycosis..pptx
Saundersa Comprehensive Review for the NCLEX-RN Examination.pdf
IMMUNITY IMMUNITY refers to protection against infection, and the immune syst...
O7-L3 Supply Chain Operations - ICLT Program
Cell Structure & Organelles in detailed.
Chinmaya Tiranga quiz Grand Finale.pdf
Tissue processing ( HISTOPATHOLOGICAL TECHNIQUE

Computational Social Science, Lecture 06: Networks, Part II

  • 1. Networks Part II Sharad Goel Columbia University Computational Social Science: Lecture 6 March 1, 2013
  • 2. Corporate E-mail Communication [ Adamic & Adar, 2004 ] via Easley & Kleinberg
  • 3. Networks/Graphs Nodes/vertices people, organizations, webpages, computers Edges represent connections between pairs of nodes
  • 5. Distance Length of the shortest path between two nodes
  • 6. Distance Length of the shortest path between two nodes
  • 7. Breadth-first Search iteratively explore nodes one layer at a time
  • 8. # initialize distances dist = {} for u in G: dist[u] = NA dist [u0] = 0 d=0 periphery = { u0 } while len(periphery) > 0: # find nodes one step away from the periphery next_level = {} for u in periphery: next_level += { w for w in neighbors[u] if dist[w] == NA } # update distances d += 1 for u in next_level: dist[u] = d # update periphery periphery = next_level
  • 9. BFS @ scale undirected network Input edge list, starting node u0 Output Distance to all nodes from u0
  • 10. BFS @ scale undirected network Input: edge list, distances (u, d) 1. join distances with edge list 2. foreach (u, d, w) output (w, d+1) [ also output (u0, 0) ] 3. group by w, and output min d
  • 11. Connected Components undirected network Input Edge list Output List of nodes for each component
  • 12. Connected Graph There is a path between every pair of nodes
  • 13. Connected Graph There is a path between every pair of nodes
  • 14. Connected Component A connected subset of nodes that is not contained in any larger connected subset
  • 15. Connected Components undirected network 1. Select a node u0 that has not yet been assigned 2. BFS starting from u0 3. Record nodes reached by BFS
  • 16. Consider the global human social network, with an edge between every pair of friends Is this network connected?
  • 17. Consider the global human social network, with an edge between every pair of friends Is this network connected? No – there are people with no (living) friends, who are hence isolated from the rest of the network
  • 18. Consider the global human social network, with an edge between every pair of friends Is there a “giant” connected component?
  • 19. Consider the global human social network, with an edge between every pair of friends Is there more than one “giant” component?
  • 20. Consider the global human social network, with an edge between every pair of friends Is there more than one “giant” component? No – unlikely to have two large disconnected sets of people
  • 21. Consider the global human social network, with an edge between every pair of friends Is there more than one “giant” component? No – unlikely to have two large disconnected sets of people Historically it was more likely e.g., pre-Columbian America & Eurasia
  • 22. Consider the global human social network, with an edge between every pair of friends On average, how far are people from one another?
  • 23. The Small-world Experiment Stanley Milgram, 1967 296 people were randomly selected in Omaha and Wichita Packages sent to the selected individuals with instructions to forward to a particular stock broker in Boston through a chain of people they knew on a first-name basis.
  • 24. The Small-world Experiment Stanley Milgram, 1967 Of the 296 packages, 232 did not reach target Of the 64 that did arrive, average path length was 6 “Six degrees of separation”
  • 25. Small-world phenomenon Is “six degrees” big or small?
  • 27. The Anatomy of the Facebook Social Graph J. Ugander, B. Karrer, L. Backstrom, C. Marlow 721 million users, 69 billion edges 5 degrees of separation
  • 28. Edge list  degree distribution undirected network Input Edge list Output Degree distribution
  • 29. 3 1 2 5 4 7 6 Degree of node u # of edges incident on u
  • 30. Edge list  degree distribution undirected network Map input: (u, w) output: (u, w), key := u output: (w, u), key := w Reduce input: u, {w1, …, wk} output: u, k
  • 31. Edge list  degree distribution undirected network Map input: u, k identity, key := k Reduce input: k, {u1, …, um} output: k, m
  • 32. An email network of 130M users Edges indicate reciprocated communication
  • 33. An email network of 130M users Edges indicate reciprocated communication (log-log plot)
  • 36. Triadic closure 1. Opportunity 2. Incentive 3. Commonality
  • 37. Counting Triangles undirected network Input adjacency list Output Number of triangles incident on each node
  • 38. Counting Triangles In memory for u in nodes: triangles[u] = 0 for w in neighbors[u]: triangles[u] += len(neighbors[w] & neighbors[u]) triangles[u] = triangles[u] / 2
  • 39. Counting Triangles @ scale Every node needs to know to which nodes it is connected and to which nodes its neighbors are connected
  • 40. Counting Triangles @ scale Map input: u {w1, …, wk} foreach wi: output wi u {w1, …, wk} Reduce In memory triangle count
  • 41. Homophily the tendency of individuals to associate with similar others “birds of a feather flock together”
  • 42. Birds of a Feather: Homophily in Social Networks McPherson, Smith-Lovin, Cook race, sex, age, religion, education, occupation, social class, behaviors, attitudes, abilities, aspirations
  • 45. Computing Homophily Input Edge list, race of each individual Output Distribution of race among friends White Black Latino Asian White Black Latino Asian
  • 46. Computing Homophily 1. Join edges (u, v) by u, demographics (w, race) by w 2. Join edges (u, v, urace) by v, demographics (w, race) by w
  • 47. Computing Homophily 1. Join edges (u, v) by u, demographics (w, race) by w 2. Join edges (u, v, urace) by v, demographics (w, race) by w 3. Group edges (u, v, urace, vrace) by sorted([urace, vrace])
  • 48. Computing Homophily 1. Join edges (u, v) by u, demographics (w, race) by w 2. Join edges (u, v, urace) by v, demographics (w, race) by w 3. Group edges (u, v, urace, vrace) by sorted([urace, vrace]) 4. Count edges in each group 5. Normalize the table
  • 49. How do ideas and products spread through society?
  • 50. The structure of diffusion 93% 5% 1% 0.3% 0.3%
  • 51. Computing the structure of diffusion Input Twitter network Time-stamped “adoptions” of 1B URLs Output Distribution of cascade structures
  • 52. Computing the structure of diffusion We assume v influenced u to adopt link t if v is the last of u’s contacts to adopt before u 2 3 5 9
  • 53. Computing the structure of diffusion Draw a labeled edge from v to u 3 t 5
  • 54. Computing the structure of diffusion Group edges by their labels (URL)
  • 55. Computing the structure of diffusion Compute the connected components for each forest corresponding to a URL
  • 56. Computing the structure of diffusion Definition. Two (rooted) trees are isomorphic if they are identical under a relabeling of the vertices.
  • 57. x (x) (x, x) ((x)) (x, (x)) Basis. The canonical name c(T) for the one-node tree T is x. Induction. If T has more than one node, let T1, . . . ,Tk denote the subtrees of the root indexed such that c(T1) ≤ c(T2) ≤ · · · ≤ c(Tk) under the lexicographic order. Then the canonical name for T is (c(T1), . . . ,c(Tk)). Aho et al. [1974]
  • 58. Computing the structure of diffusion Compute the canonical name for each tree in the URL forests
  • 59. Computing the structure of diffusion Count the number of trees of each type
  • 60. Computing the structure of diffusion We assume v influenced u to adopt link t if v is the last of u’s contacts to adopt before u 2 3 5 9
  • 61. Computing the structure of diffusion Draw a labeled edge from v to u 3 t 5
  • 62. Computing the structure of diffusion 1. Join adoptions (link, u, uts) by u, edges (u, w) by u 2. Join (link, u, uts, w) by (link, w), adoptions (link, w, wts) by (link, w)
  • 63. Computing the structure of diffusion 1. Join adoptions (link, u, uts) by u, edges (u, w) by u 2. Join (link, u, uts, w) by (link, w), adoptions (link, w, wts) by (link, w) 3. Group (link, u, uts, w, wts) by (link, u)
  • 64. Computing the structure of diffusion 1. Join adoptions (link, u, uts) by u, edges (u, w) by u 2. Join (link, u, uts, w) by (link, w), adoptions (link, w, wts) by (link, w) 3. Group (link, u, uts, w, wts) by (link, u) 4. Output unique “parent” edge (link, u, uts, w, wts) for each group