Disjoint Sets Data Structure (Chap. 21)

A disjoint-set data structure maintains a collection $\mathcal{S} = \{S_1, S_2, \ldots, S_k\}$ of disjoint dynamic sets. Each set is identified by one of its members, called its representative.

Disjoint-set operations:
MAKE-SET($x$): create a new set whose only member is $x$. Assume $x$ is not already in some other set.
UNION($x$, $y$): combine the two sets containing $x$ and $y$ into one new set. A new representative is selected.
FIND-SET($x$): return the representative of the set containing $x$.

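Multiple Operations

Suppose a sequence of operations is performed, where
$n$ is the number of MAKE-SET operations (all executed at the beginning), and
$m$ is the total number of MAKE-SET, UNION, and FIND-SET operations.
Then $m \ge n$, and the number of UNION operations is at most $n - 1$, since each UNION reduces the number of sets by one.
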
An Application of Disjoint Sets

Determine the connected components of an undirected graph.

CONNECTED-COMPONENTS(G)
    for each vertex v ∈ V[G]
        do MAKE-SET(v)
    for each edge (u, v) ∈ E[G]
        do if FIND-SET(u) ≠ FIND-SET(v)
            then UNION(u, v)

SAME-COMPONENT(u, v)
    if FIND-SET(u) = FIND-SET(v)
        then return TRUE
        else return FALSE

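As a rough illustration of CONNECTED-COMPONENTS and SAME-COMPONENT, the Python sketch below uses a deliberately simple dictionary-based forest with no heuristics. The function names, the `parent` dictionary, and the sample graph are assumptions made for this example and do not come from the slides.

```python
def make_set(parent, x):
    parent[x] = x                      # x is the root of its own one-node tree

def find_set(parent, x):
    while parent[x] != x:              # walk parent pointers up to the root
        x = parent[x]
    return x

def union(parent, x, y):
    parent[find_set(parent, x)] = find_set(parent, y)

def connected_components(vertices, edges):
    parent = {}
    for v in vertices:
        make_set(parent, v)
    for u, v in edges:
        if find_set(parent, u) != find_set(parent, v):
            union(parent, u, v)
    return parent

def same_component(parent, u, v):
    return find_set(parent, u) == find_set(parent, v)

# Hypothetical graph: components {a, b, c}, {d, e}, {f}.
parent = connected_components("abcdef", [("a", "b"), ("b", "c"), ("d", "e")])
print(same_component(parent, "a", "c"))
print(same_component(parent, "a", "d"))
```

After the preprocessing pass, each SAME-COMPONENT query costs only two FIND-SET calls, which is why the speed of FIND-SET and UNION matters in the rest of the slides.
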
Linked-List Implementation

Each set is represented as a linked list with head and tail pointers. Each node contains a value, a pointer to the next node, and a pointer back to the representative (the head of the list).
MAKE-SET costs $O(1)$: just create a single-element list.
FIND-SET costs $O(1)$: just return the back-to-representative pointer.

Linked Lists for Two Sets

[Figure: linked lists, each with head and tail pointers, for the sets {c, h, e} and {f, g}, and the single list that results from the UNION of the two sets.]

UNION Implementation

A simple implementation: UNION($x$, $y$) appends $x$'s list to the end of $y$'s list and updates every back-to-representative pointer in $x$'s list to point to the head of $y$'s list. Each UNION therefore takes time linear in the length of $x$'s list.

Suppose $n$ MAKE-SET($x_i$) operations ($O(1)$ each) are followed by $n - 1$ UNION operations:
UNION($x_1$, $x_2$) costs $O(1)$,
UNION($x_2$, $x_3$) costs $O(2)$,
...
UNION($x_{n-1}$, $x_n$) costs $O(n - 1)$.

The UNIONs cost $1 + 2 + \cdots + (n - 1) = \Theta(n^2)$ in total. So the $2n - 1$ operations cost $\Theta(n^2)$, an average of $\Theta(n)$ each. This is too slow; how can it be improved?

Weighted-Union Heuristic

Instead of always appending $x$'s list to $y$'s list, append the shorter list to the longer one. Store with each list its length, that is, the number of elements it contains.

Result: a sequence of $m$ MAKE-SET, UNION, and FIND-SET operations, $n$ of which are MAKE-SET operations, runs in $O(m + n \lg n)$ time.

Why? Count the number of updates to the back-to-representative pointer of any element $x$ in a collection of $n$ elements. The pointer of $x$ is updated only when $x$ is in the shorter list, so each update at least doubles the size of the set containing $x$. No set can have more than $n$ elements, so the pointer of $x$ is updated at most $\lg n$ times. Summed over all $n$ elements, the UNIONs cost $O(n \lg n)$; the MAKE-SET and FIND-SET operations cost $O(1)$ each, for $O(m)$ in total.

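The weighted-union heuristic can be sketched in Python as follows. This is a minimal model under stated assumptions, not the slides' implementation: Python lists stand in for the linked lists, a dictionary `rep` stands in for the back-to-representative pointers, and the `updates` counter is added only to make the pointer-update count visible.

```python
class ListSets:
    """Disjoint sets as lists with back-to-representative pointers (illustrative)."""

    def __init__(self):
        self.rep = {}        # element -> representative of its set
        self.members = {}    # representative -> list of members
        self.updates = 0     # number of representative-pointer updates so far

    def make_set(self, x):
        self.rep[x] = x
        self.members[x] = [x]

    def find_set(self, x):
        return self.rep[x]

    def union(self, x, y):
        a, b = self.rep[x], self.rep[y]
        if a == b:
            return
        if len(self.members[a]) < len(self.members[b]):
            a, b = b, a                      # make a the longer list
        for z in self.members[b]:            # repoint only the shorter list
            self.rep[z] = a
            self.updates += 1
        self.members[a].extend(self.members.pop(b))

n = 1024
sets = ListSets()
for i in range(n):
    sets.make_set(i)
for i in range(n - 1):
    sets.union(i, i + 1)                     # the chain that was quadratic before
print(sets.updates)                          # n - 1 updates here, within n lg n
```

On the chain of unions that cost $\Theta(n^2)$ for the simple implementation, every UNION here appends a one-element list to the growing list, so only $n - 1$ pointers are updated in total.
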
Disjoint-Set Implementation: Forests

Each set is a rooted tree, and the root is the representative. Each node points only to its parent; the root points to itself.

[Figure: trees for the sets {c, h, e} and {f, d}, and the single tree that results from their UNION.]

Straightforward Solution

Three operations:
MAKE-SET($x$): create a tree containing only $x$. $O(1)$.
FIND-SET($x$): follow the chain of parent pointers up to the root. $O(\text{height of } x\text{'s tree})$.
UNION($x$, $y$): make the root of one tree point to the root of the other. $O(1)$ once the two roots are known.

A sequence of $n - 1$ UNIONs can produce a tree of height $n - 1$ (just a linear chain of $n$ nodes), so $n$ FIND-SET operations can cost $O(n^2)$.

Union by Rank & Path Compression

Union by rank: each node has a rank, which is an upper bound on the height of the node (that is, the height of the subtree rooted at the node). On UNION, the root with smaller rank is made to point to the root with larger rank.

Path compression: during FIND-SET($x$), make each node on the path from $x$ to the root point directly to the root, which reduces the height of the tree.

Path Compression

[Figure: a tree with nodes f, e, d, c, shown before and after path compression.]

Algorithm for Disjoint-Set Forest

MAKE-SET(x)
    p[x] ← x
    rank[x] ← 0

UNION(x, y)
    LINK(FIND-SET(x), FIND-SET(y))

LINK(x, y)
    if rank[x] > rank[y]
        then p[y] ← x
        else p[x] ← y
            if rank[x] = rank[y]
                then rank[y] ← rank[y] + 1

FIND-SET(x)
    if x ≠ p[x]
        then p[x] ← FIND-SET(p[x])
    return p[x]

The worst-case running time for $m$ MAKE-SET, UNION, and FIND-SET operations is $O(m\,\alpha(n))$, where $\alpha(n) \le 4$ for all practical values of $n$. So the running time is nearly linear in $m$.

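The pseudocode above translates almost line for line into Python. The sketch below is one possible rendering: the class name, the dictionary storage, and the guard against linking a root to itself are assumptions added for illustration (the pseudocode's LINK is only ever meant to receive two distinct roots).

```python
class DisjointSetForest:
    """Disjoint-set forest with union by rank and path compression (sketch)."""

    def __init__(self):
        self.p = {}      # parent pointers; a root points to itself
        self.rank = {}   # upper bound on the height of each node

    def make_set(self, x):
        self.p[x] = x
        self.rank[x] = 0

    def link(self, x, y):
        if x == y:
            return       # added guard: x and y are already in the same set
        if self.rank[x] > self.rank[y]:
            self.p[y] = x
        else:
            self.p[x] = y
            if self.rank[x] == self.rank[y]:
                self.rank[y] += 1

    def find_set(self, x):
        if x != self.p[x]:
            self.p[x] = self.find_set(self.p[x])   # path compression
        return self.p[x]

    def union(self, x, y):
        self.link(self.find_set(x), self.find_set(y))

forest = DisjointSetForest()
for v in range(8):
    forest.make_set(v)
for u, v in [(0, 1), (2, 3), (0, 2), (4, 5)]:
    forest.union(u, v)
root = forest.find_set(0)
print(forest.p[0] == root, forest.find_set(3) == root, forest.find_set(4) == root)
```

After `find_set(0)` returns, node 0 points directly at its root, which is the effect of path compression. The recursion depth stays small because union by rank keeps tree heights logarithmic in the number of elements.
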
Analysis of Union by Rank with Path Compression (by amortized analysis)

The analysis covers the following:
A very quickly growing function and its very slowly growing inverse.
Properties of ranks.
Proof of the time bound $O(m\,\alpha(n))$, where $\alpha(n)$ is a very slowly growing function.

A Very Quickly Growing Function and Its Inverse

For integers $k \ge 0$ and $j \ge 1$, define
$$A_k(j) = \begin{cases} j + 1 & \text{if } k = 0, \\ A_{k-1}^{(j+1)}(j) & \text{if } k \ge 1, \end{cases}$$
where $A_{k-1}^{(0)}(j) = j$ and $A_{k-1}^{(i)}(j) = A_{k-1}\bigl(A_{k-1}^{(i-1)}(j)\bigr)$ for $i \ge 1$.

$k$ is called the level of the function, and $i$ is the number of iterations. $A_k(j)$ strictly increases with both $j$ and $k$. The following slides show how quickly it grows.

Rate of Increase of $A_k(j)$

Lemma 21.2 (page 510): For any integer $j \ge 1$, $A_1(j) = 2j + 1$.
Proof: By induction on $i$, $A_0^{(i)}(j) = j + i$. So $A_1(j) = A_0^{(j+1)}(j) = j + (j + 1) = 2j + 1$.

Lemma 21.3 (page 510): For any integer $j \ge 1$, $A_2(j) = 2^{j+1}(j + 1) - 1$.
Proof: By induction on $i$, $A_1^{(i)}(j) = 2^i (j + 1) - 1$. So $A_2(j) = A_1^{(j+1)}(j) = 2^{j+1}(j + 1) - 1$.

How Quickly $A_k(j)$ Increases

Consider $A_k(1)$ for $k = 0, 1, 2, 3, 4$:
$A_0(1) = 1 + 1 = 2$
$A_1(1) = 2 \cdot 1 + 1 = 3$
$A_2(1) = 2^{1+1}(1 + 1) - 1 = 7$
$A_3(1) = A_2^{(1+1)}(1) = A_2^{(2)}(1) = A_2(A_2(1)) = A_2(7) = 2^{7+1}(7 + 1) - 1 = 2^8 \cdot 8 - 1 = 2047$
$A_4(1) = A_3^{(2)}(1) = A_3(A_3(1)) = A_3(2047) = A_2^{(2048)}(2047) \gg A_2(2047) = 2^{2048} \cdot 2048 - 1 > 2^{2048} = (2^4)^{512} = 16^{512} \gg 10^{80}$, the estimated number of atoms in the universe.

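The small values above can be checked directly from the recursive definition. The Python function below is a straightforward transcription of that definition; the name `A` and the choice to evaluate only levels 0 through 3 are assumptions for the sketch, since $A_4(1)$ is far too large to compute this way.

```python
def A(k, j):
    """A_k(j): A_0(j) = j + 1, and A_k(j) applies A_{k-1} to j a total of j + 1 times."""
    if k == 0:
        return j + 1
    value = j
    for _ in range(j + 1):
        value = A(k - 1, value)
    return value

print([A(k, 1) for k in range(4)])   # [2, 3, 7, 2047]

# Spot-check the closed forms of Lemmas 21.2 and 21.3 for small j.
for j in range(1, 6):
    assert A(1, j) == 2 * j + 1
    assert A(2, j) == 2 ** (j + 1) * (j + 1) - 1
```
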
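Inverse of $A_k(n)$: $\alpha(n)$

$\alpha(n) = \min\{k : A_k(1) \ge n\}$ (so $A_{\alpha(n)}(1) \ge n$).

$$\alpha(n) = \begin{cases} 0 & \text{for } 0 \le n \le 2, \\ 1 & \text{for } n = 3, \\ 2 & \text{for } 4 \le n \le 7, \\ 3 & \text{for } 8 \le n \le 2047, \\ 4 & \text{for } 2048 \le n \le A_4(1). \end{cases}$$

$\alpha(n)$ grows extremely slowly: $\alpha(n) \le 4$ for all practical purposes.
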
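For practical input sizes, $\alpha(n)$ can be evaluated from the four thresholds $A_0(1), \ldots, A_3(1)$ listed in the table. The sketch below hard-codes those values; the function name `alpha` is an assumption, and the final `return 4` is valid only for $n \le A_4(1)$, which covers any input that fits in a real machine.

```python
def alpha(n):
    """alpha(n) = min{k : A_k(1) >= n}, valid for 0 <= n <= A_4(1)."""
    thresholds = (2, 3, 7, 2047)         # A_0(1), A_1(1), A_2(1), A_3(1)
    for k, a_k_of_1 in enumerate(thresholds):
        if n <= a_k_of_1:
            return k
    return 4                             # assumes n <= A_4(1)

print([alpha(n) for n in (2, 3, 7, 8, 2047, 2048, 10**18)])   # [0, 1, 2, 3, 3, 4, 4]
```
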
$O(m\,\alpha(n))$ Bound: Properties of Ranks

Lemma 21.4 (page 511): For all nodes $x$, $\mathrm{rank}[x] \le \mathrm{rank}[p[x]]$, with strict inequality if $x \ne p[x]$.
Corollary 21.5 (page 511): As we follow the path from any node to the root, the node ranks strictly increase.
Lemma 21.6 (page 512): Every node has rank at most $n - 1$.
Proof: Each rank begins at 0 and increases only in LINK operations, of which there are at most $n - 1$. In fact, every rank is at most $\lfloor \lg n \rfloor$.

$O(m\,\alpha(n))$ Bound Proof

The proof uses amortized analysis (Chap. 17) and works with LINK instead of UNION, since every UNION is carried out by two FIND-SETs and one LINK.

Lemma 21.7 (page 512): Suppose a sequence $S'$ of $m'$ MAKE-SET, UNION, and FIND-SET operations is converted into a sequence $S$ of $m$ MAKE-SET, LINK, and FIND-SET operations by turning each UNION into two FIND-SETs and one LINK. If $S$ runs in $O(m\,\alpha(n))$ time, then $S'$ runs in $O(m'\,\alpha(n))$ time.
Proof: Because $m' \le m \le 3m'$, we have $m = O(m')$.

Potential Function

For each node $x$, let $\phi_q(x)$ be the potential of $x$ after $q$ operations. The potential of the entire forest is $\Phi_q = \sum_x \phi_q(x)$. $\Phi_0 = 0$ at the beginning, and $\Phi_q$ is never negative.

$$\phi_q(x) = \begin{cases} \alpha(n) \cdot \mathrm{rank}[x] & \text{if } x \text{ is a root or } \mathrm{rank}[x] = 0, \\ (\alpha(n) - \mathrm{level}(x)) \cdot \mathrm{rank}[x] - \mathrm{iter}(x) & \text{otherwise.} \end{cases}$$

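level($x$) and iter($x$)

Both are defined for a node $x$ that is not a root and has $\mathrm{rank}[x] \ge 1$.

$\mathrm{level}(x) = \max\{k : \mathrm{rank}[p[x]] \ge A_k(\mathrm{rank}[x])\}$.
$0 \le \mathrm{level}(x) < \alpha(n)$, since $\mathrm{rank}[p[x]] \ge \mathrm{rank}[x] + 1 = A_0(\mathrm{rank}[x])$ and $A_{\alpha(n)}(\mathrm{rank}[x]) \ge A_{\alpha(n)}(1) \ge n > \mathrm{rank}[p[x]]$.

$\mathrm{iter}(x) = \max\{i : \mathrm{rank}[p[x]] \ge A_{\mathrm{level}(x)}^{(i)}(\mathrm{rank}[x])\}$.
$1 \le \mathrm{iter}(x) \le \mathrm{rank}[x]$, since $\mathrm{rank}[p[x]] \ge A_{\mathrm{level}(x)}(\mathrm{rank}[x]) = A_{\mathrm{level}(x)}^{(1)}(\mathrm{rank}[x])$ and $A_{\mathrm{level}(x)}^{(\mathrm{rank}[x]+1)}(\mathrm{rank}[x]) = A_{\mathrm{level}(x)+1}(\mathrm{rank}[x]) > \mathrm{rank}[p[x]]$.
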
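To make the two definitions concrete, the sketch below computes level and iter from a pair of ranks. It is an illustration under assumptions: the arguments `r` and `pr` stand for $\mathrm{rank}[x]$ and $\mathrm{rank}[p[x]]$ with $pr > r \ge 1$, and the helper `a_capped` (a name invented here) evaluates $A_k(j)$ but stops as soon as the value exceeds a cap, which avoids ever building astronomically large numbers.

```python
def a_capped(k, j, cap):
    """Return A_k(j) if it is at most cap, otherwise cap + 1."""
    if j > cap:
        return cap + 1
    if k == 0:
        return min(j + 1, cap + 1)
    value = j
    for _ in range(j + 1):               # A_k(j) iterates A_{k-1} j + 1 times
        value = a_capped(k - 1, value, cap)
        if value > cap:                   # iterates only grow, so stop early
            return cap + 1
    return value

def level(r, pr):
    """max{k : pr >= A_k(r)}, assuming pr > r >= 1."""
    k = 0
    while a_capped(k + 1, r, pr) <= pr:
        k += 1
    return k

def iter_(r, pr):
    """max{i : pr >= A_level^(i)(r)}, assuming pr > r >= 1."""
    k = level(r, pr)
    i, value = 1, a_capped(k, r, pr)      # the first iterate is <= pr by definition of level
    while True:
        value = a_capped(k, value, pr)
        if value > pr:
            return i
        i += 1

for r, pr in [(1, 2), (1, 7), (2, 4), (2, 5)]:
    print(r, pr, level(r, pr), iter_(r, pr))
```

For example, with $\mathrm{rank}[x] = 2$ and $\mathrm{rank}[p[x]] = 4$, level is 0 and iter is 2, because $A_0^{(2)}(2) = 4 \le 4 < 5 = A_0^{(3)}(2) = A_1(2)$; this also shows iter reaching its upper bound $\mathrm{rank}[x]$.
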
Relations among $\mathrm{rank}[p[x]]$, level($x$), and iter($x$)

Since $\mathrm{rank}[p[x]]$ monotonically increases over time, iter($x$) can decrease only if level($x$) increases. Put another way, as long as level($x$) remains unchanged, iter($x$) must either increase or remain unchanged.

Properties of the Potential Function $\phi_q(x)$

Lemma 21.8 (page 514): For every node $x$ and for all $q$, $0 \le \phi_q(x) \le \alpha(n) \cdot \mathrm{rank}[x]$.
Proof: If $x$ is a root or $\mathrm{rank}[x] = 0$, the claim holds by definition. Suppose $x$ is not a root and $\mathrm{rank}[x] > 0$. Then
$\phi_q(x) = [\alpha(n) - \mathrm{level}(x)] \cdot \mathrm{rank}[x] - \mathrm{iter}(x) \ge (\alpha(n) - (\alpha(n) - 1)) \cdot \mathrm{rank}[x] - \mathrm{rank}[x] = \mathrm{rank}[x] - \mathrm{rank}[x] = 0$,
and
$\phi_q(x) = [\alpha(n) - \mathrm{level}(x)] \cdot \mathrm{rank}[x] - \mathrm{iter}(x) \le [\alpha(n) - 0] \cdot \mathrm{rank}[x] - 1 = \alpha(n) \cdot \mathrm{rank}[x] - 1 < \alpha(n) \cdot \mathrm{rank}[x]$.

Potential Changes of Operations

Lemma 21.9 (page 515): Let $x$ be a node that is not a root, and suppose the $q$th operation is either a LINK or a FIND-SET. Then after the $q$th operation, $\phi_q(x) \le \phi_{q-1}(x)$. Moreover, if $\mathrm{rank}[x] \ge 1$ and either level($x$) or iter($x$) changes due to the $q$th operation, then $\phi_q(x) \le \phi_{q-1}(x) - 1$.

Proof: Since $x$ is not a root, $\mathrm{rank}[x]$ does not change; $n$ does not change, so $\alpha(n)$ does not change.
If $\mathrm{rank}[x] = 0$, then $\phi_q(x) = \phi_{q-1}(x) = 0$. So suppose $\mathrm{rank}[x] > 0$.
If level($x$) does not change:
    if iter($x$) does not change, then $\phi_q(x) = \phi_{q-1}(x)$, since every term stays the same;
    if iter($x$) increases, it increases by at least 1, so $\phi_q(x)$ decreases by at least 1.
If level($x$) increases (by at least 1), then $(\alpha(n) - \mathrm{level}(x)) \cdot \mathrm{rank}[x]$ drops by at least $\mathrm{rank}[x]$. Even if iter($x$) drops, it drops by at most $\mathrm{rank}[x] - 1$, so $\phi_q(x)$ drops by at least $\mathrm{rank}[x] - (\mathrm{rank}[x] - 1) = 1$. Thus $\phi_q(x) \le \phi_{q-1}(x) - 1$.

Amortized Costs of Operations

Lemma 21.10 (page 515): The amortized cost of each MAKE-SET operation is $O(1)$.
Proof: The operation creates a single node $x$ with rank 0, so $\phi_q(x) = 0$. Nothing else in the forest changes, so $\Phi_q = \Phi_{q-1}$. What remains is the actual cost, which is $O(1)$.

Amortized Costs of Operations (cont.)

Lemma 21.11 (page 515): The amortized cost of each LINK operation is $O(\alpha(n))$.
Proof: Suppose LINK($x$, $y$) makes $y$ the parent of $x$. The actual cost of the LINK operation is $O(1)$. For the change in potential, three kinds of nodes matter: $x$, $y$, and the children of $y$ just before the operation.
By Lemma 21.9, the potential of each old child of $y$ does not increase.
For $x$ (which changes from a root to a non-root): if $\mathrm{rank}[x] = 0$, then $\phi_q(x) = \phi_{q-1}(x) = 0$; otherwise $\phi_q(x) = [\alpha(n) - \mathrm{level}(x)] \cdot \mathrm{rank}[x] - \mathrm{iter}(x) \le [\alpha(n) - 0] \cdot \mathrm{rank}[x] - 1 = \alpha(n) \cdot \mathrm{rank}[x] - 1 < \alpha(n) \cdot \mathrm{rank}[x] = \phi_{q-1}(x)$.
For $y$: $\mathrm{rank}[y]$ either stays the same or increases by 1, so $\phi_q(y) = \alpha(n) \cdot \mathrm{rank}[y]$ equals either $\phi_{q-1}(y)$ or $\phi_{q-1}(y) + \alpha(n)$.
Thus the potential increases by at most $\alpha(n)$ due to the LINK operation, and the amortized cost is $O(1) + O(\alpha(n)) = O(\alpha(n))$.

Amortized Costs of Operations (cont.)

Lemma 21.12 (page 516): The amortized cost of each FIND-SET operation is $O(\alpha(n))$.
Proof: Suppose there are $s$ nodes on the find path. The actual cost of FIND-SET is $O(s)$. The root's potential does not change, and no other node's potential increases (by Lemma 21.9). At least $\max(0, s - (\alpha(n) + 2))$ nodes on the find path have their potential decrease by at least 1. Thus the amortized cost is at most $O(s) - (s - (\alpha(n) + 2)) = O(\alpha(n))$.

Proof of Lemma 21.12 (cont.)

Claim: at least $\max(0, s - (\alpha(n) + 2))$ nodes on the find path have their potential decrease by at least 1.

Let $x$ be a node on the find path such that $\mathrm{rank}[x] > 0$ and $x$ is followed somewhere on the path by a node $y$ that is not a root, with level($y$) = level($x$) just before the FIND-SET.
At most $\alpha(n) + 2$ nodes on the path fail to satisfy these conditions: the first node (if it has rank 0), the root, and, for each $k = 0, 1, \ldots, \alpha(n) - 1$, the last node $w$ on the path with level($w$) $= k$.
Thus at least $\max(0, s - (\alpha(n) + 2))$ nodes satisfy them. Fix such an $x$; it remains to show that $x$'s potential decreases by at least 1.

Proof of Lemma 21.12 (cont.)

Let $k = \mathrm{level}(x) = \mathrm{level}(y)$. Just prior to the path compression caused by the FIND-SET, we have
$\mathrm{rank}[p[x]] \ge A_k^{(\mathrm{iter}(x))}(\mathrm{rank}[x])$ (by the definition of iter($x$)),
$\mathrm{rank}[p[y]] \ge A_k(\mathrm{rank}[y])$ (by the definition of level($y$)),
$\mathrm{rank}[y] \ge \mathrm{rank}[p[x]]$ (since $y$ follows $x$ on the find path).

Let $i = \mathrm{iter}(x)$ before path compression. Then
$\mathrm{rank}[p[y]] \ge A_k(\mathrm{rank}[y]) \ge A_k(\mathrm{rank}[p[x]]) \ge A_k\bigl(A_k^{(i)}(\mathrm{rank}[x])\bigr) = A_k^{(i+1)}(\mathrm{rank}[x])$,
where the second and third inequalities hold because $A_k(j)$ is strictly increasing.

After path compression, $x$ and $y$ have the same parent, so $\mathrm{rank}[p[x]] = \mathrm{rank}[p[y]]$. Path compression does not decrease $\mathrm{rank}[p[y]]$, and $\mathrm{rank}[x]$ does not change, so $\mathrm{rank}[p[x]] \ge A_k^{(i+1)}(\mathrm{rank}[x])$. This means that either iter($x$) increases (to at least $i + 1$) or level($x$) increases. Thus by Lemma 21.9, $\phi_q(x) \le \phi_{q-1}(x) - 1$; that is, $x$'s potential decreases by at least 1. This proves Lemma 21.12.

Upper Bound for Disjoint Sets

Theorem 21.13 (page 517): A sequence of $m$ MAKE-SET, UNION, and FIND-SET operations, $n$ of which are MAKE-SET operations, can be performed on a disjoint-set forest with union by rank and path compression in worst-case time $O(m\,\alpha(n))$.

Summary

Disjoint sets: three operations; different implementations with different costs.
Forest implementation: union by rank and path compression.
Properties: rank, level, iter.
Amortized analysis of the operations: potential function.
The function $A_k(j)$:
$$A_k(j) = \begin{cases} j + 1 & \text{if } k = 0, \\ A_{k-1}^{(j+1)}(j) & \text{if } k \ge 1, \end{cases}$$
where $A_{k-1}^{(0)}(j) = j$ and $A_{k-1}^{(i)}(j) = A_{k-1}\bigl(A_{k-1}^{(i-1)}(j)\bigr)$ for $i \ge 1$.
$k$ is called the level of the function, and $i$ is the number of iterations.
$\alpha(n) = \min\{k : A_k(1) \ge n\}$.

A Typical Example Using Disjoint Sets

Kruskal's algorithm (minimum spanning tree):
    sort the edges of G in increasing order by length
    keep a subgraph S of G, initially empty
    for each edge e in sorted order
        if the endpoints of e are disconnected in S
            add e to S
    return S

Note: this is a greedy algorithm.

Analysis: Testing whether two endpoints are disconnected looks like it should be slow (linear time per iteration, or $O(mn)$ total, where here $m$ is the number of edges and $n$ the number of vertices). With a disjoint-set forest, each test is two FIND-SET operations and takes nearly constant amortized time, $O(\alpha(n))$.

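A compact Python sketch of Kruskal's algorithm on top of a disjoint-set forest is given below. The edge format `(weight, u, v)`, the vertex numbering `0..num_vertices - 1`, and the sample graph are assumptions chosen for the example; the forest uses union by rank and path compression as in the earlier slides.

```python
def kruskal(num_vertices, edges):
    """Return the edges (u, v, weight) of a minimum spanning forest."""
    parent = list(range(num_vertices))   # MAKE-SET for every vertex
    rank = [0] * num_vertices

    def find_set(x):
        if parent[x] != x:
            parent[x] = find_set(parent[x])   # path compression
        return parent[x]

    def link(x, y):                           # union by rank on two distinct roots
        if rank[x] > rank[y]:
            parent[y] = x
        else:
            parent[x] = y
            if rank[x] == rank[y]:
                rank[y] += 1

    tree = []
    for weight, u, v in sorted(edges):        # increasing order by weight
        root_u, root_v = find_set(u), find_set(v)
        if root_u != root_v:                  # endpoints are disconnected in S
            link(root_u, root_v)
            tree.append((u, v, weight))
    return tree

# Hypothetical graph with 4 vertices and 5 weighted edges.
edges = [(3, 0, 2), (1, 0, 1), (4, 2, 3), (2, 1, 2), (5, 1, 3)]
mst = kruskal(4, edges)
total = sum(weight for _, _, weight in mst)
print(mst, total)   # [(0, 1, 1), (1, 2, 2), (2, 3, 4)] 7
```

With the disjoint-set operations this cheap, the sort dominates the running time of the whole algorithm.
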

Disjoint sets
