distributed graph algorithms
Generalized Architecture For Some Graph Problems
Abhilash Kumar and Saurav Kumar
November 10, 2015
Indian Institute of Technology Kanpur
problem statement
∙ Compute all connected sub-graphs of a given graph, in a distributed environment
∙ Develop a generalized architecture to solve similar graph problems
motivation
∙ Exponential number of connected sub-graphs of a given graph
∙ Necessity to build distributed systems that utilize the worldwide plethora of distributed resources
approach
Insights
∙ Connected sub-graphs exhibit sub-structure
∙ Extend smaller sub-graphs by adding an outgoing edge to generate larger sub-graphs
∙ Base cases are the single-edge sub-graphs, one for each edge of the graph
approach
Algorithm to compute all connected sub-graphs
∙ Initialize:
  ∙ Queue Q
  ∙ For each edge in the input graph G:
    ∙ Create a sub-graph G' representing that edge
    ∙ Push G' to Q
∙ Process:
  ∙ While Q is not empty:
    ∙ G' = Q.pop()
    ∙ Save G'
    ∙ For each outgoing edge E of G':
      ∙ G'' = G' ∪ E
      ∙ If G'' has not been seen yet, push G'' to Q
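On a single machine, the steps above can be sketched in Python as follows (a minimal sketch; helper names are ours, not from the DGA repository):

```python
from collections import deque

def connected_subgraphs(edges):
    """Enumerate all connected edge-induced sub-graphs of an
    undirected graph given as a list of edges (u, v)."""
    # Map each vertex to the set of edges incident to it.
    incident = {}
    for edge in edges:
        for v in edge:
            incident.setdefault(v, set()).add(edge)

    queue, seen = deque(), set()
    # Initialize: one single-edge sub-graph per edge of the graph.
    for edge in edges:
        g = frozenset([edge])
        queue.append(g)
        seen.add(g)

    # Process: pop a sub-graph, save it, extend by one outgoing edge.
    while queue:
        g = queue.popleft()
        yield g                                   # "Save G'"
        vertices = {v for edge in g for v in edge}
        for v in vertices:
            for edge in incident[v] - g:          # outgoing edges of G'
                g2 = g | {edge}                   # G'' = G' ∪ E
                if g2 not in seen:                # uniqueness check
                    seen.add(g2)
                    queue.append(g2)
```

For the triangle on three vertices this yields 7 connected sub-graphs: 3 single edges, 3 two-edge paths and the full triangle.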
approach
Figure: Generating initial sub-graphs from a given graph
Figure: Extending a sub-graph to generate new sub-graphs
Figure: Only unique generated sub-graphs are considered for further processing
architecture
Master-Slave Architecture
∙ Commonly used approach for parallel and distributed applications
∙ Message passing over TCP for communication
∙ Master assigns tasks to slaves and finally collects the results
∙ A Task object represents a sub-graph and contains all information necessary to process it
∙ A slave may request a task from other slaves when its own task queue is empty; processing ends when all task queues are empty
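A slave's life cycle can be pictured with the following simplified, single-process sketch (function names are illustrative assumptions, not the DGA interface):

```python
import queue

def slave_loop(task_queue, request_task_from_peer, process, save, is_unique):
    """Main loop of a slave: drain the local task queue, push unique
    children back, and steal work from peers when idle."""
    while True:
        try:
            task = task_queue.get_nowait()
        except queue.Empty:
            # Local queue empty: ask other slaves for work.
            task = request_task_from_peer()
            if task is None:          # every task queue is empty: finished
                return
        save(task)                    # report the result
        for child in process(task):
            if is_unique(child):      # distributed Bloom-filter check
                task_queue.put(child)
```

In the real system `request_task_from_peer` and `is_unique` would be network calls to other slaves and to the distributed Bloom filter.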
architecture
Task, Queue and Bloom filter
∙ A task carries this information:
  ∙ A list of vertices that are already in the sub-graph
  ∙ A list of edges that can be used to extend it in the next step
∙ Task Queue
  ∙ Each slave has a task queue
  ∙ A slave picks a task from its queue and processes it
  ∙ Newly generated unique tasks are pushed into the task queue
∙ Bloom filter
  ∙ A Bloom filter checks the uniqueness of newly generated tasks (i.e. sub-graphs)
  ∙ The Bloom filter is itself distributed, so that no single server gets overloaded
architecture
Bloom Filter vs Hashing
∙ We use a Bloom filter because it is very space efficient
∙ Space required for a false-positive probability p over n items is −n · ln p / (ln 2)² bits
∙ The error probability can be reduced with very little extra space
∙ Hashing can be used instead to make the algorithm deterministic
∙ A Bloom filter can also be parallelized, whereas hashing cannot be
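The sizing formula above can be checked directly (standard Bloom-filter math, not code from the project):

```python
import math

def bloom_filter_bits(n, p):
    """Bits m needed to store n items with false-positive
    probability p:  m = -n * ln(p) / (ln 2)^2."""
    return math.ceil(-n * math.log(p) / math.log(2) ** 2)

def optimal_hash_count(n, m):
    """Optimal number of hash functions:  k = (m / n) * ln 2."""
    return max(1, round(m / n * math.log(2)))
```

About 10 bits per item already gives a 1% error rate, and doubling that to 20 bits per item squares the error down to 0.01% — the "very little extra space" claim above.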
architecture
How to use this architecture?
∙ Two functions are required: initialize and process
∙ initialize generates the initial tasks; the master randomly assigns these tasks to the slaves
∙ process defines the procedure that generates new tasks from a given task (extending a sub-graph, in our case)
architecture
Fitting the connected sub-graph problem
∙ initialize creates all the one-edge tasks (sub-graphs)
∙ process takes a connected sub-graph and extends it by adding each extendable edge, one at a time
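For this problem, the two plug-in functions might look like the following sketch (the real interface lives in github.com/abhilak/DGA; names and signatures here are illustrative):

```python
def initialize(edges):
    """Create the initial tasks: one single-edge sub-graph per edge."""
    return [frozenset([edge]) for edge in edges]

def process(task, incident):
    """Extend one sub-graph (task) by each outgoing edge, one at a
    time; `incident` maps each vertex to its set of incident edges."""
    vertices = {v for edge in task for v in edge}
    outgoing = set()
    for v in vertices:
        outgoing |= incident[v] - task   # edges touching the sub-graph
    return [task | {edge} for edge in outgoing]
```

The framework itself then handles queuing, deduplication via the Bloom filter, and distribution across slaves.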
simulation
Simulation for testing
∙ Used 2 machines, say H and L
  ∙ H: 24 cores, 200 GB RAM, Xeon E5645 @ 2.40 GHz
  ∙ L: 4 cores, 8 GB RAM, i5-3230M @ 2.60 GHz
∙ Opened multiple ports (6 on H, 2 on L) to mimic 8 slave servers
∙ Used various combinations of numbers of slaves on H and L
∙ Used 2 tree graphs, G(14, 13) and G(16, 15): easy to verify results
∙ Collected the number of tasks processed and the number of hash-check queries made by each slave
∙ Collected total running-time data for both graphs, including cases with network faults
results
Figure: Number of hash check queries vs number of slaves for G(14, 13)
Figure: Distribution of number of tasks processed by slaves for G(14, 13)
Figure: Number of hash check queries vs number of slaves for G(16, 15)
Figure: Distribution of number of tasks processed by slaves for G(16, 15)
results
Actual Running Time
∙ Network faults occurred, especially because few physical machines were available
∙ The architecture recovers from these faults, but recovery consumes a lot of time
∙ For G(14, 13), running time ranged from 15 s to 91 s
∙ For G(16, 15), running time ranged from 255 s to 447 s
∙ These figures are for the case where the process function does no additional computation per sub-graph
Figure: Running time when process does additional computation (10 ms per sub-graph)
advantages
Advantages
∙ Highly scalable
  ∙ More slaves can be added easily
  ∙ Performance increases with the number of slaves
∙ Even distribution of tasks: efficient machines process more tasks
∙ Highly reusable architecture
  ∙ Many other problems can be solved with it
  ∙ Only two functions need to be provided: initialize and process
∙ Tolerant to network faults
advantages
Other problems that can be solved using this paradigm
∙ Generating all cliques, paths, cycles, sub-trees and spanning sub-trees
∙ A few classical NP-hard problems, such as enumerating all maximal cliques and TSP
future works
Further improvements
∙ Implement a parallelized Bloom filter
∙ Solve tasks in parallel within a slave (on powerful servers)
∙ Handle slave/master failures
∙ Use file I/O to store the task queue for large problems
∙ Explore this paradigm to solve other problems
conclusion
Conclusion
∙ The algorithm is efficient: the total computation is at most m × T, where T is the minimum computation required to find all sub-graphs and m is the number of edges
∙ In practice the running time is c × T, where c is much smaller; the bound on c can be improved to min(m, log T)
∙ Since we are enumerating all connected sub-graphs, T itself had better not be very large for the problem to be tractable
∙ The architecture lets us solve this problem in a much more scalable manner and significantly reduces the computation time, given good infrastructure and a careful implementation
Questions?
Implementation of the algorithm and the architecture is available at github.com/abhilak/DGA
Slides created using Beamer (mtheme) and plot.ly on ShareLaTeX
Thank You

More Related Content

PDF
Debugging and Profiling C++ Template Metaprograms
PDF
Kotlin functional programming basic@Kotlin TW study group
PDF
Categories for the Working C++ Programmer
PDF
Cilk - An Efficient Multithreaded Runtime System
ODP
Optimized declarative transformation First Eclipse QVTc results
PDF
С++ without new and delete
PPT
Queue implementation
ODP
IIUG 2016 Gathering Informix data into R
Debugging and Profiling C++ Template Metaprograms
Kotlin functional programming basic@Kotlin TW study group
Categories for the Working C++ Programmer
Cilk - An Efficient Multithreaded Runtime System
Optimized declarative transformation First Eclipse QVTc results
С++ without new and delete
Queue implementation
IIUG 2016 Gathering Informix data into R

What's hot (17)

PDF
Python - Lecture 10
PDF
3 little clojure functions
PDF
PDF
Two C++ Tools: Compiler Explorer and Cpp Insights
PPTX
Scilab: Computing Tool For Engineers
PDF
[CCC'21] Evaluation of Work Stealing Algorithms
PDF
Model checker for NTCC
PDF
Garbage collection
PDF
Modeling the Behavior of Threads in the PREEMPT_RT Linux Kernel Using Automata
PDF
Golang dot-testing-lite
DOCX
Net practicals lab mannual
PDF
PDF
Scilab-by-dr-gomez-june2014
PDF
SLE2015: Distributed ATL
PDF
Effective java item 80 and 81
PPTX
Cape2013 scilab-workshop-19Oct13
Python - Lecture 10
3 little clojure functions
Two C++ Tools: Compiler Explorer and Cpp Insights
Scilab: Computing Tool For Engineers
[CCC'21] Evaluation of Work Stealing Algorithms
Model checker for NTCC
Garbage collection
Modeling the Behavior of Threads in the PREEMPT_RT Linux Kernel Using Automata
Golang dot-testing-lite
Net practicals lab mannual
Scilab-by-dr-gomez-june2014
SLE2015: Distributed ATL
Effective java item 80 and 81
Cape2013 scilab-workshop-19Oct13
Ad

Viewers also liked (15)

PDF
18 Basic Graph Algorithms
PPT
1535 graph algorithms
PDF
Topological Sort
PPTX
Graph Traversal Algorithm
PPTX
Fano algorithm
PPTX
Socialite, the Open Source Status Feed Part 2: Managing the Social Graph
PPTX
Graph Traversal Algorithms - Depth First Search Traversal
PPTX
Shannon Fano
PPT
2.2 topological sort 02
PPTX
DFS and BFS
PPT
Graphs bfs dfs
PDF
Graph theory
PPT
Bfs and dfs in data structure
PPTX
Depth first search and breadth first searching
PPT
Compression
18 Basic Graph Algorithms
1535 graph algorithms
Topological Sort
Graph Traversal Algorithm
Fano algorithm
Socialite, the Open Source Status Feed Part 2: Managing the Social Graph
Graph Traversal Algorithms - Depth First Search Traversal
Shannon Fano
2.2 topological sort 02
DFS and BFS
Graphs bfs dfs
Graph theory
Bfs and dfs in data structure
Depth first search and breadth first searching
Compression
Ad

Similar to Distributed Graph Algorithms (20)

PPTX
PREGEL a system for large scale graph processing
PDF
Pregel: A System for Large-Scale Graph Processing
PPT
Graphs.pptGraphs.pptGraphs.pptGraphs.pptGraphs.pptGraphs.ppt
PPTX
MathWorks Interview Lecture
PPTX
Unit ix graph
PDF
Graphhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh.pdf
PPT
Graphs
PPTX
Unit 9 graph
PPTX
Unit 4 dsuc
PPT
PPTX
12_Graph.pptx
PPTX
Data structure
PDF
Introducing Apache Giraph for Large Scale Graph Processing
PPTX
Data Structure and algorithms - Graph1.pptx
PPTX
WEB DEVELOPMET FRONT END WITH ADVANCED RECEAT
PPTX
Data structure Graph PPT ( BFS & DFS ) NOTES
PPTX
Graph_data_structure_information_engineering.pptx
PPTX
Algorithms and data Chapter 3 V Graph.pptx
PDF
Talk on Graph Theory - I
PPTX
6. Graphs
PREGEL a system for large scale graph processing
Pregel: A System for Large-Scale Graph Processing
Graphs.pptGraphs.pptGraphs.pptGraphs.pptGraphs.pptGraphs.ppt
MathWorks Interview Lecture
Unit ix graph
Graphhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh.pdf
Graphs
Unit 9 graph
Unit 4 dsuc
12_Graph.pptx
Data structure
Introducing Apache Giraph for Large Scale Graph Processing
Data Structure and algorithms - Graph1.pptx
WEB DEVELOPMET FRONT END WITH ADVANCED RECEAT
Data structure Graph PPT ( BFS & DFS ) NOTES
Graph_data_structure_information_engineering.pptx
Algorithms and data Chapter 3 V Graph.pptx
Talk on Graph Theory - I
6. Graphs

Recently uploaded (20)

PPTX
Infosys Presentation by1.Riyan Bagwan 2.Samadhan Naiknavare 3.Gaurav Shinde 4...
PDF
Well-logging-methods_new................
PPTX
M Tech Sem 1 Civil Engineering Environmental Sciences.pptx
PDF
R24 SURVEYING LAB MANUAL for civil enggi
PDF
Evaluating the Democratization of the Turkish Armed Forces from a Normative P...
PPTX
Lecture Notes Electrical Wiring System Components
PPTX
Artificial Intelligence
PPTX
CH1 Production IntroductoryConcepts.pptx
PPT
introduction to datamining and warehousing
PDF
Embodied AI: Ushering in the Next Era of Intelligent Systems
PDF
Model Code of Practice - Construction Work - 21102022 .pdf
PPT
Project quality management in manufacturing
PPTX
Internet of Things (IOT) - A guide to understanding
PPT
Mechanical Engineering MATERIALS Selection
PPTX
MET 305 2019 SCHEME MODULE 2 COMPLETE.pptx
PDF
composite construction of structures.pdf
PPTX
FINAL REVIEW FOR COPD DIANOSIS FOR PULMONARY DISEASE.pptx
DOCX
ASol_English-Language-Literature-Set-1-27-02-2023-converted.docx
PDF
PRIZ Academy - 9 Windows Thinking Where to Invest Today to Win Tomorrow.pdf
PPTX
bas. eng. economics group 4 presentation 1.pptx
Infosys Presentation by1.Riyan Bagwan 2.Samadhan Naiknavare 3.Gaurav Shinde 4...
Well-logging-methods_new................
M Tech Sem 1 Civil Engineering Environmental Sciences.pptx
R24 SURVEYING LAB MANUAL for civil enggi
Evaluating the Democratization of the Turkish Armed Forces from a Normative P...
Lecture Notes Electrical Wiring System Components
Artificial Intelligence
CH1 Production IntroductoryConcepts.pptx
introduction to datamining and warehousing
Embodied AI: Ushering in the Next Era of Intelligent Systems
Model Code of Practice - Construction Work - 21102022 .pdf
Project quality management in manufacturing
Internet of Things (IOT) - A guide to understanding
Mechanical Engineering MATERIALS Selection
MET 305 2019 SCHEME MODULE 2 COMPLETE.pptx
composite construction of structures.pdf
FINAL REVIEW FOR COPD DIANOSIS FOR PULMONARY DISEASE.pptx
ASol_English-Language-Literature-Set-1-27-02-2023-converted.docx
PRIZ Academy - 9 Windows Thinking Where to Invest Today to Win Tomorrow.pdf
bas. eng. economics group 4 presentation 1.pptx

Distributed Graph Algorithms

  • 1. distributed graph algorithms Generalized Architecture For Some Graph Problems Abhilash Kumar and Saurav Kumar November 10, 2015 Indian Institute of Technology Kanpur
  • 3. problem statement ∙ Compute all connected sub-graphs of a given graph, in a distributed environment 2
  • 4. problem statement ∙ Compute all connected sub-graphs of a given graph, in a distributed environment ∙ Develop a generalized architecture to solve similar graph problems 2
  • 6. motivation ∙ Exponential number of connected sub-graphs of a given graph 4
  • 7. motivation ∙ Exponential number of connected sub-graphs of a given graph ∙ Necessity to build distributed systems which utilize the worldwide plethora of distributed resources 4
  • 10. approach Insights ∙ Connected sub-graphs exhibit sub-structure ∙ Extend smaller sub-graphs by adding an outgoing edge to generate larger sub-graphs 6
  • 11. approach Insights ∙ Connected sub-graphs exhibit sub-structure ∙ Extend smaller sub-graphs by adding an outgoing edge to generate larger sub-graphs ∙ Base cases are sub-graphs represented by all the edges of the graph 6
  • 12. approach Algorithm to compute all connected sub-graphs ∙ Initialize: 7
  • 13. approach Algorithm to compute all connected sub-graphs ∙ Initialize: ∙ Queue Q 7
  • 14. approach Algorithm to compute all connected sub-graphs ∙ Initialize: ∙ Queue Q ∙ For each edge in G 7
  • 15. approach Algorithm to compute all connected sub-graphs ∙ Initialize: ∙ Queue Q ∙ For each edge in G ∙ Create a sub-graph G’ representing the edge 7
  • 16. approach Algorithm to compute all connected sub-graphs ∙ Initialize: ∙ Queue Q ∙ For each edge in G ∙ Create a sub-graph G’ representing the edge ∙ Push G’ to Q 7
  • 17. approach Algorithm to compute all connected sub-graphs ∙ Initialize: ∙ Queue Q ∙ For each edge in G ∙ Create a sub-graph G’ representing the edge ∙ Push G’ to Q ∙ Process: 7
  • 18. approach Algorithm to compute all connected sub-graphs ∙ Initialize: ∙ Queue Q ∙ For each edge in G ∙ Create a sub-graph G’ representing the edge ∙ Push G’ to Q ∙ Process: ∙ while Q is not empty 7
  • 19. approach Algorithm to compute all connected sub-graphs ∙ Initialize: ∙ Queue Q ∙ For each edge in G ∙ Create a sub-graph G’ representing the edge ∙ Push G’ to Q ∙ Process: ∙ while Q is not empty ∙ G = Q.pop() 7
  • 20. approach Algorithm to compute all connected sub-graphs ∙ Initialize: ∙ Queue Q ∙ For each edge in G ∙ Create a sub-graph G’ representing the edge ∙ Push G’ to Q ∙ Process: ∙ while Q is not empty ∙ G = Q.pop() ∙ Save G 7
  • 21. approach Algorithm to compute all connected sub-graphs ∙ Initialize: ∙ Queue Q ∙ For each edge in G ∙ Create a sub-graph G’ representing the edge ∙ Push G’ to Q ∙ Process: ∙ while Q is not empty ∙ G = Q.pop() ∙ Save G ∙ For each outgoing edge E of G G’ = G U E if G’ has not been seen yet Push G’ to Q 7
  • 22. approach Algorithm to compute all connected sub-graphs ∙ Initialize: ∙ Queue Q ∙ For each edge in G ∙ Create a sub-graph G’ representing the edge ∙ Push G’ to Q ∙ Process: ∙ while Q is not empty ∙ G = Q.pop() ∙ Save G ∙ For each outgoing edge E of G G’ = G U E if G’ has not been seen yet Push G’ to Q 7
  • 23. approach Algorithm to compute all connected sub-graphs ∙ Initialize: ∙ Queue Q ∙ For each edge in G ∙ Create a sub-graph G’ representing the edge ∙ Push G’ to Q ∙ Process: ∙ while Q is not empty ∙ G = Q.pop() ∙ Save G ∙ For each outgoing edge E of G G’ = G U E if G’ has not been seen yet Push G’ to Q 7
  • 24. approach Figure: Generating initial sub-graphs from a given graph 8
  • 25. approach Figure: Extending a sub-graph to generate new sub-graphs 9
  • 26. approach Figure: Consider only unique sub-graphs generated for further processing 10
  • 28. architecture Master-Slave Architecture ∙ Commonly used approach for parallel and distributed applications 12
  • 29. architecture Master-Slave Architecture ∙ Commonly used approach for parallel and distributed applications ∙ Message passing to communicate over TCP 12
  • 30. architecture Master-Slave Architecture ∙ Commonly used approach for parallel and distributed applications ∙ Message passing to communicate over TCP ∙ Master assigns tasks to slaves and finally collects the results 12
  • 31. architecture Master-Slave Architecture ∙ Commonly used approach for parallel and distributed applications ∙ Message passing to communicate over TCP ∙ Master assigns tasks to slaves and finally collects the results ∙ A Task object represents a sub-graph which contains all necessary information to process that sub-graph 12
  • 32. architecture Master-Slave Architecture ∙ Commonly used approach for parallel and distributed applications ∙ Message passing to communicate over TCP ∙ Master assigns tasks to slaves and finally collects the results ∙ A Task object represents a sub-graph which contains all necessary information to process that sub-graph ∙ A slave may request a task from other slaves when its task queue is empty and processing ends when all task queues are empty 12
  • 33. architecture Task, Queue and Bloom filter ∙ A task has these information: 13
  • 34. architecture Task, Queue and Bloom filter ∙ A task has these information: ∙ A list of vertices that are already in the sub-graph 13
  • 35. architecture Task, Queue and Bloom filter ∙ A task has these information: ∙ A list of vertices that are already in the sub-graph ∙ A list of edges that can be extended in the next step 13
  • 36. architecture Task, Queue and Bloom filter ∙ A task has these information: ∙ A list of vertices that are already in the sub-graph ∙ A list of edges that can be extended in the next step ∙ Task Queue 13
  • 37. architecture Task, Queue and Bloom filter ∙ A task has these information: ∙ A list of vertices that are already in the sub-graph ∙ A list of edges that can be extended in the next step ∙ Task Queue ∙ Each slave has a task queue 13
  • 38. architecture Task, Queue and Bloom filter ∙ A task has these information: ∙ A list of vertices that are already in the sub-graph ∙ A list of edges that can be extended in the next step ∙ Task Queue ∙ Each slave has a task queue ∙ Slave picks up a task from its task queue and processes it 13
  • 39. architecture Task, Queue and Bloom filter ∙ A task has these information: ∙ A list of vertices that are already in the sub-graph ∙ A list of edges that can be extended in the next step ∙ Task Queue ∙ Each slave has a task queue ∙ Slave picks up a task from its task queue and processes it ∙ Newly generated unique tasks are pushed into the task queue 13
  • 40. architecture Task, Queue and Bloom filter ∙ A task has these information: ∙ A list of vertices that are already in the sub-graph ∙ A list of edges that can be extended in the next step ∙ Task Queue ∙ Each slave has a task queue ∙ Slave picks up a task from its task queue and processes it ∙ Newly generated unique tasks are pushed into the task queue ∙ Bloom filter 13
  • 41. architecture Task, Queue and Bloom filter ∙ A task has these information: ∙ A list of vertices that are already in the sub-graph ∙ A list of edges that can be extended in the next step ∙ Task Queue ∙ Each slave has a task queue ∙ Slave picks up a task from its task queue and processes it ∙ Newly generated unique tasks are pushed into the task queue ∙ Bloom filter ∙ We use Bloom filter to check uniqueness of the newly generated tasks (i.e. sub-graphs) 13
architecture
Task, Queue, and Bloom filter
∙ A task carries two pieces of information:
∙ The list of vertices already in the sub-graph
∙ The list of edges that can extend it in the next step
∙ Task queue
∙ Each slave has its own task queue
∙ A slave picks a task from its queue and processes it
∙ Newly generated unique tasks are pushed back into the queue
∙ Bloom filter
∙ A Bloom filter checks the uniqueness of newly generated tasks (i.e. sub-graphs)
∙ The Bloom filter is itself distributed, so no single server gets overloaded
13
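The task record and the slave's queue-plus-uniqueness-check loop can be sketched as follows. This is a single-machine illustration, not the original implementation: the names (`run_slave`, `process`) are ours, and a plain Python set stands in for the distributed Bloom filter (it is exact, so there are no false positives).

```python
from collections import deque

def run_slave(initial_tasks, process):
    """Drain one slave's task queue: pop a task, generate successor
    tasks with `process`, and enqueue only those not seen before.
    A plain set plays the Bloom filter's role of the uniqueness check."""
    queue = deque(initial_tasks)
    seen = set(initial_tasks)
    processed = []
    while queue:
        task = queue.popleft()
        processed.append(task)
        for new_task in process(task):
            if new_task not in seen:   # uniqueness check (the Bloom filter's job)
                seen.add(new_task)
                queue.append(new_task)
    return processed

# Toy process function: a "task" is a frozenset of vertices, and each
# step adds one more vertex from the fixed universe {0, 1, 2}.
grow = lambda t: [t | {v} for v in range(3) if v not in t]
tasks = run_slave([frozenset({v}) for v in range(3)], grow)
# Every non-empty subset of {0, 1, 2} is processed exactly once.
```

In the distributed setting, `seen` is the shared (partitioned) Bloom filter and each slave runs its own copy of this loop against its own queue.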
architecture
Bloom filter vs. hashing
∙ We use a Bloom filter because it is very space-efficient
∙ Space required for error probability p is −n ln p / (ln 2)² bits
∙ The error probability can be reduced with very little extra space
∙ Hashing (an exact hash set) can be used to make the algorithm deterministic
∙ A Bloom filter can also be parallelized, whereas hashing cannot
14
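As a sanity check on the space formula, a small calculation (a hypothetical helper, not part of the original code) shows just how cheaply the error probability drops:

```python
import math

def bloom_bits_per_item(p):
    """Bits per stored item for false-positive probability p,
    from m = -n * ln(p) / (ln 2)^2 with n = 1."""
    return -math.log(p) / math.log(2) ** 2

# Roughly 9.6 bits per item give a 1% error rate, and every further
# 10x reduction in p costs only about 4.8 extra bits per item.
one_percent = bloom_bits_per_item(0.01)     # ~9.6 bits
tenth_percent = bloom_bits_per_item(0.001)  # ~14.4 bits
```

Since ln is additive, each factor-of-10 improvement in p always costs the same fixed increment of bits per item, which is why tightening the filter is so cheap.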
architecture
How to use this architecture?
∙ Two functions are required: initialize and process
∙ initialize generates the initial tasks; the master randomly assigns them to the slaves
∙ process defines the procedure that generates new tasks from a given task (extending a sub-graph, in our case)
15
architecture
Fitting the connected sub-graph problem
∙ initialize creates one task (sub-graph) per edge of the graph
∙ process takes a connected sub-graph and extends it by each extendable edge, one at a time
16
simulation
Simulation for testing
∙ Used two machines, H and L
∙ H: 24 cores, 200 GB RAM, Xeon E5645 @ 2.40 GHz
∙ L: 4 cores, 8 GB RAM, i5-3230M @ 2.60 GHz
∙ Opened multiple ports (6 on H, 2 on L) to mimic 8 slave servers
18
simulation
Simulation for testing
∙ Used various combinations of slave counts on H and L
∙ Used two tree graphs, G(14, 13) and G(16, 15), whose results are easy to verify
∙ Collected the number of tasks processed and the number of hash-check queries made by each slave
∙ Collected total running times for both graphs, including runs with network faults
19
results
Figure: Number of hash-check queries vs. number of slaves for G(14, 13)
21
Figure: Distribution of the number of tasks processed by slaves for G(14, 13)
22–24
Figure: Number of hash-check queries vs. number of slaves for G(16, 15)
25
Figure: Distribution of the number of tasks processed by slaves for G(16, 15)
26–28
results
Actual running time
∙ Network faults occurred, especially because few physical machines were available
∙ The architecture recovers from these faults, but recovery consumes a lot of time
∙ For G(14, 13), the running time ranged from 15 s to 91 s
∙ For G(16, 15), the running time ranged from 255 s to 447 s
∙ These figures are for a process function that does no additional computation per sub-graph
29
results
Figure: Running time when process does additional computation (10 ms per sub-graph)
30
advantages
Advantages
∙ Highly scalable
∙ More slaves can be added easily
∙ Performance increases with the number of slaves
∙ Even distribution of tasks: more efficient machines process more tasks
∙ The architecture is highly reusable
∙ Many other problems can be solved with it
∙ Only two functions need to be provided: initialize and process
∙ Tolerant of network faults
32
advantages
Other problems that can be solved using this paradigm
∙ Generating all cliques, paths, cycles, sub-trees, and spanning sub-trees
∙ A few classical NP-hard problems, such as enumerating all maximal cliques and TSP
33
future works
Further improvements
∙ Implement a parallelized Bloom filter
∙ Solve tasks in parallel within a slave (on powerful servers)
∙ Handle slave/master failures
∙ Use file I/O to store the task queue for large problems
∙ Explore this paradigm on other problems
35
conclusion
Conclusion
∙ The algorithm is efficient: the total computation is at most m × T, where T is the minimum computation required to find all sub-graphs and m is the number of edges
∙ In practice the running time is c × T with c much smaller than m; the bound on c can be improved to min(m, log T)
∙ Since we are enumerating all connected sub-graphs, the problem is only practical when T itself is not too large
∙ Given good infrastructure and a careful implementation, the architecture lets us solve the problem in a much more scalable manner and significantly reduces computation time
37
Questions?
Implementation of the algorithm and the architecture is available at github.com/abhilak/DGA
Slides created using Beamer (mtheme) and plot.ly on ShareLaTeX
38