SlideShare a Scribd company logo
GraphLab: A New Framework For Parallel Machine
Learning
Amir H. Payberah
amir@sics.se
Amirkabir University of Technology
(Tehran Polytechnic)
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 1 / 42
Reminder
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 2 / 42
Data-Parallel Model for Large-Scale Graph Processing
The platforms that have worked well for developing parallel applica-
tions are not necessarily effective for large-scale graph problems.
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 3 / 42
Graph-Parallel Processing
Restricts the types of computation.
New techniques to partition and distribute graphs.
Exploit graph structure.
Executes graph algorithms orders-of-magnitude faster than more
general data-parallel systems.
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 4 / 42
Data-Parallel vs. Graph-Parallel Computation
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 5 / 42
Pregel
Vertex-centric
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 6 / 42
Pregel
Vertex-centric
Bulk Synchronous Parallel (BSP)
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 6 / 42
Pregel
Vertex-centric
Bulk Synchronous Parallel (BSP)
Runs in sequence of iterations (supersteps)
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 6 / 42
Pregel
Vertex-centric
Bulk Synchronous Parallel (BSP)
Runs in sequence of iterations (supersteps)
A vertex in superstep S can:
• reads messages sent to it in superstep S-1.
• sends messages to other vertices: receiving at superstep S+1.
• modifies its state.
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 6 / 42
Pregel Limitations
Inefficient if different regions of the graph converge at different
speed.
Can suffer if one task is more expensive than the others.
Runtime of each phase is determined by the slowest machine.
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 7 / 42
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 8 / 42
Data Model
A directed graph that stores the program state, called data graph.
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 9 / 42
Vertex Scope
The scope of vertex v is the data stored in vertex v, in all adjacent
vertices and adjacent edges.
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 10 / 42
Programming Model (1/3)
Rather than adopting a message passing as in Pregel, GraphLab
allows the user defined function of a vertex to read and modify any
of the data in its scope.
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 11 / 42
Programming Model (2/3)
Update function: user-defined function similar to Compute in Pregel.
Can read and modify the data within the scope of a vertex.
Schedules the future execution of other update functions.
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 12 / 42
Programming Model (3/3)
Sync function: similar to aggregate in Pregel.
Maintains global aggregates.
Performs periodically in the background.
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 13 / 42
Execution Model
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 14 / 42
Execution Model
Each task in the set of tasks T , is a tuple (f, v) consisting of an
update function f and a vertex v.
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 14 / 42
Execution Model
Each task in the set of tasks T , is a tuple (f, v) consisting of an
update function f and a vertex v.
After executing an update function (f, g, · · ·) the modified scope
data in Sv is written back to the data graph.
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 14 / 42
Example: PageRank
GraphLab_PageRank(i)
// compute sum over neighbors
total = 0
foreach(j in in_neighbors(i)):
total = total + R[j] * wji
// update the PageRank
R[i] = 0.15 + total
// trigger neighbors to run again
foreach(j in out_neighbors(i)):
signal vertex-program on j
R[i] = 0.15 +
j∈Nbrs(i)
wjiR[j]
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 15 / 42
Data Consistency (1/3)
Overlapped scopes: race-condition in simultaneous execution of two
update functions.
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 16 / 42
Data Consistency (1/3)
Overlapped scopes: race-condition in simultaneous execution of two
update functions.
Full consistency: during the execution f(v), no other function reads
or modifies data within the v scope.
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 16 / 42
Data Consistency (2/3)
Edge consistency: during the execution f(v), no other function
reads or modifies any of the data on v or any of the edges adja-
cent to v.
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 17 / 42
Data Consistency (3/3)
Vertex consistency: during the execution f(v), no other function
will be applied to v.
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 18 / 42
Sequential Consistency (1/2)
Proving the correctness of a parallel algorithm: sequential consistency
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 19 / 42
Sequential Consistency (1/2)
Proving the correctness of a parallel algorithm: sequential consistency
Sequential consistency: if for every parallel execution, there exists a
sequential execution of update functions that produces an equivalent
result.
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 19 / 42
Sequential Consistency (2/2)
A simple method to achieve serializability is to ensure that the scopes
of concurrently executing update functions do not overlap.
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 20 / 42
Sequential Consistency (2/2)
A simple method to achieve serializability is to ensure that the scopes
of concurrently executing update functions do not overlap.
• The full consistency model is used.
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 20 / 42
Sequential Consistency (2/2)
A simple method to achieve serializability is to ensure that the scopes
of concurrently executing update functions do not overlap.
• The full consistency model is used.
• The edge consistency model is used and update functions do not modify
data in adjacent vertices.
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 20 / 42
Sequential Consistency (2/2)
A simple method to achieve serializability is to ensure that the scopes
of concurrently executing update functions do not overlap.
• The full consistency model is used.
• The edge consistency model is used and update functions do not modify
data in adjacent vertices.
• The vertex consistency model is used and update functions only access
local vertex data.
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 20 / 42
Consistency vs. Parallelism
Consistency vs. Parallelism
[Low, Y., GraphLab: A Distributed Abstraction for Large Scale Machine Learning (Doctoral dissertation, University of
California), 2013.]
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 21 / 42
GraphLab Implementation
Shared memory implementation
Distributed implementation
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 22 / 42
GraphLab Implementation
Shared memory implementation
Distributed implementation
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 22 / 42
Tasks Schedulers (1/2)
In what order should the tasks (vertex-update function pairs) be
called?
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 23 / 42
Tasks Schedulers (1/2)
In what order should the tasks (vertex-update function pairs) be
called?
• A collection of base schedules, e.g., round-robin, and synchronous.
• Set scheduler: enables users to compose custom update schedules.
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 23 / 42
Tasks Schedulers (2/2)
How to add new task in the queue?
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 24 / 42
Tasks Schedulers (2/2)
How to add new task in the queue?
• FIFO: only permits task creation but do not permit task reordering.
• Prioritized: permits task reordering at the cost of increased overhead.
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 24 / 42
Consistency
Implemented in C++ using PThreads for parallelism.
Consistency: read-write lock
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 25 / 42
Consistency
Implemented in C++ using PThreads for parallelism.
Consistency: read-write lock
Vertex consistency
• Central vertex (write-lock)
Edge consistency
• Central vertex (write-lock)
• Adjacent vertices (read-locks)
Full consistency
• Central vertex (write-locks)
• Adjacent vertices (write-locks)
Deadlocks are avoided by acquiring locks sequentially following a
canonical order.
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 25 / 42
GraphLab Implementation
Shared memory implementation
Distributed implementation
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 26 / 42
Distributed Implementation
Graph partitioning
• How to efficiently load, partition and distribute the data graph across
machines?
Consistency
• How to achieve consistency in the distributed setting?
Fault tolerance
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 27 / 42
Graph Partitioning
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 28 / 42
Graph Partitioning - Phase 1 (1/2)
Two-phase partitioning.
Partitioning the data graph into k parts, called atom.
• k number of machines
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 29 / 42
Graph Partitioning - Phase 1 (1/2)
Two-phase partitioning.
Partitioning the data graph into k parts, called atom.
• k number of machines
meta-graph: the graph of atoms (one vertex for each atom).
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 29 / 42
Graph Partitioning - Phase 1 (1/2)
Two-phase partitioning.
Partitioning the data graph into k parts, called atom.
• k number of machines
meta-graph: the graph of atoms (one vertex for each atom).
Atom weight: the amount of data it stores.
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 29 / 42
Graph Partitioning - Phase 1 (1/2)
Two-phase partitioning.
Partitioning the data graph into k parts, called atom.
• k number of machines
meta-graph: the graph of atoms (one vertex for each atom).
Atom weight: the amount of data it stores.
Edge weight: the number of edges crossing the atoms.
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 29 / 42
Graph Partitioning - Phase 1 (2/2)
Each atom is stored as a separate file on a distributed storage system,
e.g., HDFS.
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 30 / 42
Graph Partitioning - Phase 1 (2/2)
Each atom is stored as a separate file on a distributed storage system,
e.g., HDFS.
Each atom file is a simple binary that stores interior and the ghosts
of the partition information.
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 30 / 42
Graph Partitioning - Phase 1 (2/2)
Each atom is stored as a separate file on a distributed storage system,
e.g., HDFS.
Each atom file is a simple binary that stores interior and the ghosts
of the partition information.
Ghost: set of vertices and edges adjacent to the partition boundary.
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 30 / 42
Graph Partitioning - Phase 2
Meta-graph is very small.
A fast balanced partition of the meta-graph over the physical ma-
chines.
Assigning graph atoms to machines.
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 31 / 42
Consistency
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 32 / 42
Consistency
To achieve a serializable parallel execution of a set of dependent
tasks.
• Chromatic engine
• Distributed locking engine
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 33 / 42
Consistency - Chromatic Engine
Construct a vertex coloring: assigns a color to each vertex such that
no adjacent vertices share the same color.
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 34 / 42
Consistency - Chromatic Engine
Construct a vertex coloring: assigns a color to each vertex such that
no adjacent vertices share the same color.
Edge consistency: executing, synchronously, all update tasks asso-
ciated with vertices of the same color before proceeding to the next
color.
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 34 / 42
Consistency - Chromatic Engine
Construct a vertex coloring: assigns a color to each vertex such that
no adjacent vertices share the same color.
Edge consistency: executing, synchronously, all update tasks asso-
ciated with vertices of the same color before proceeding to the next
color.
Full consistency: no vertex shares the same color as any of its dis-
tance two neighbors.
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 34 / 42
Consistency - Chromatic Engine
Construct a vertex coloring: assigns a color to each vertex such that
no adjacent vertices share the same color.
Edge consistency: executing, synchronously, all update tasks asso-
ciated with vertices of the same color before proceeding to the next
color.
Full consistency: no vertex shares the same color as any of its dis-
tance two neighbors.
Vertex consistency: assigning all vertices the same color.
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 34 / 42
Consistency - Distributed Locking Engine
Associating a readers-writer lock with each vertex.
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 35 / 42
Consistency - Distributed Locking Engine
Associating a readers-writer lock with each vertex.
Vertex consistency
• Central vertex (write-lock)
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 35 / 42
Consistency - Distributed Locking Engine
Associating a readers-writer lock with each vertex.
Vertex consistency
• Central vertex (write-lock)
Edge consistency
• Central vertex (write-lock), Adjacent vertices (read-locks)
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 35 / 42
Consistency - Distributed Locking Engine
Associating a readers-writer lock with each vertex.
Vertex consistency
• Central vertex (write-lock)
Edge consistency
• Central vertex (write-lock), Adjacent vertices (read-locks)
Full consistency
• Central vertex (write-locks), Adjacent vertices (write-locks)
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 35 / 42
Consistency - Distributed Locking Engine
Associating a readers-writer lock with each vertex.
Vertex consistency
• Central vertex (write-lock)
Edge consistency
• Central vertex (write-lock), Adjacent vertices (read-locks)
Full consistency
• Central vertex (write-locks), Adjacent vertices (write-locks)
Deadlocks are avoided by acquiring locks sequentially following a
canonical order.
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 35 / 42
Fault Tolerance
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 36 / 42
Fault Tolerance - Synchronous
The systems periodically signals all computation activity to halt.
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 37 / 42
Fault Tolerance - Synchronous
The systems periodically signals all computation activity to halt.
Then synchronizes all caches (ghosts) and saves to disk all data
which has been modified since the last snapshot.
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 37 / 42
Fault Tolerance - Synchronous
The systems periodically signals all computation activity to halt.
Then synchronizes all caches (ghosts) and saves to disk all data
which has been modified since the last snapshot.
Simple, but eliminates the systems advantage of asynchronous com-
putation.
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 37 / 42
Fault Tolerance - Asynchronous
Based on the Chandy-Lamport algorithm.
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 38 / 42
Fault Tolerance - Asynchronous
Based on the Chandy-Lamport algorithm.
The snapshot function is implemented as an update function in
vertices.
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 38 / 42
Fault Tolerance - Asynchronous
Based on the Chandy-Lamport algorithm.
The snapshot function is implemented as an update function in
vertices.
The snapshot update takes priority over all other update functions.
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 38 / 42
Fault Tolerance - Asynchronous
Based on the Chandy-Lamport algorithm.
The snapshot function is implemented as an update function in
vertices.
The snapshot update takes priority over all other update functions.
Edge Consistency is used on all update functions.
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 38 / 42
Fault Tolerance - Asynchronous
Based on the Chandy-Lamport algorithm.
The snapshot function is implemented as an update function in
vertices.
The snapshot update takes priority over all other update functions.
Edge Consistency is used on all update functions.
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 38 / 42
Summary
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 39 / 42
GraphLab Summary
Asynchronous model
Vertex-centric
Communication: distributed shared memory
Three consistency levels: full, edge-level, and vertex-level
Partitioning: two-phase partitioning
Consistency: chromatic engine (graph coloring), distributed lock
engine (reader-writer lock)
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 40 / 42
GraphLab Limitations
Poor performance on Natural graphs.
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 41 / 42
Questions?
Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 42 / 42

More Related Content

PDF
Graph processing - Powergraph and GraphX
PDF
Graph processing - Pregel
PDF
Spark Stream and SEEP
DOC
Uma Sagar_c
PDF
Fault-tolerant topology and routing synthesis for IEEE time-sensitive network...
PDF
Portfolio control version sn_v5
PDF
Advanced Process Monitoring IMF
PPT
Aws100 ch06 thermal
Graph processing - Powergraph and GraphX
Graph processing - Pregel
Spark Stream and SEEP
Uma Sagar_c
Fault-tolerant topology and routing synthesis for IEEE time-sensitive network...
Portfolio control version sn_v5
Advanced Process Monitoring IMF
Aws100 ch06 thermal

What's hot (20)

PPTX
FEA- Introduction
PDF
Design and Implementation of Single Precision Pipelined Floating Point Co-Pro...
PDF
Advanced property tracking Industrial Modeling Framework
PPTX
Learning ANSYS Fluent R19 using modeling a Fluidized Bed with nano particles
PPT
AHF_IDETC_2011_Jie
PDF
Novel high functionality fault tolerant ALU
PDF
J045075661
DOCX
Comparative analysis of cascade h-bridge multilevel Voltage source inverter
PDF
Advanced Data Science with Apache Spark-(Reza Zadeh, Stanford)
PDF
[Capella Day 2019] Model execution and system simulation in Capella
PDF
M.G.Goman, A.V.Khramtsovsky (2008) - Computational framework for investigatio...
PDF
ACES_Journal_February_2012_Paper_07
PDF
Estimation of Base Drag On Supersonic Cruise Missile
PPT
CFD & ANSYS FLUENT
PDF
Iaetsd 128-bit area
PDF
Component Based Control System Design
PDF
(8) modeling of transfer function characteristic of rlc circuit
PDF
Pid output fuzzified water level control in mimo coupled tank system
PDF
IRJET- A Study on Algorithms for FFT Computations
PDF
Supporting Flight Test And Flight Matching
FEA- Introduction
Design and Implementation of Single Precision Pipelined Floating Point Co-Pro...
Advanced property tracking Industrial Modeling Framework
Learning ANSYS Fluent R19 using modeling a Fluidized Bed with nano particles
AHF_IDETC_2011_Jie
Novel high functionality fault tolerant ALU
J045075661
Comparative analysis of cascade h-bridge multilevel Voltage source inverter
Advanced Data Science with Apache Spark-(Reza Zadeh, Stanford)
[Capella Day 2019] Model execution and system simulation in Capella
M.G.Goman, A.V.Khramtsovsky (2008) - Computational framework for investigatio...
ACES_Journal_February_2012_Paper_07
Estimation of Base Drag On Supersonic Cruise Missile
CFD & ANSYS FLUENT
Iaetsd 128-bit area
Component Based Control System Design
(8) modeling of transfer function characteristic of rlc circuit
Pid output fuzzified water level control in mimo coupled tank system
IRJET- A Study on Algorithms for FFT Computations
Supporting Flight Test And Flight Matching
Ad

Viewers also liked (20)

PDF
MapReduce
PDF
Introduction to stream processing
PDF
Process Management - Part3
PDF
Linux Module Programming
PDF
Process Management - Part1
PDF
MegaStore and Spanner
PDF
Introduction to Operating Systems - Part3
PDF
Threads
PDF
Mesos and YARN
PDF
IO Systems
PDF
Main Memory - Part2
PDF
Process Management - Part2
PDF
Introduction to Operating Systems - Part2
PDF
Cloud Computing
PDF
Security
PDF
Protection
PDF
Storage
PDF
CPU Scheduling - Part2
PDF
The Stratosphere Big Data Analytics Platform
PDF
Process Synchronization - Part2
MapReduce
Introduction to stream processing
Process Management - Part3
Linux Module Programming
Process Management - Part1
MegaStore and Spanner
Introduction to Operating Systems - Part3
Threads
Mesos and YARN
IO Systems
Main Memory - Part2
Process Management - Part2
Introduction to Operating Systems - Part2
Cloud Computing
Security
Protection
Storage
CPU Scheduling - Part2
The Stratosphere Big Data Analytics Platform
Process Synchronization - Part2
Ad

Similar to Graph processing - Graphlab (20)

PPT
Raminder kaur presentation_two
PDF
GRAPH MATCHING ALGORITHM FOR TASK ASSIGNMENT PROBLEM
PPTX
Data Parallel and Object Oriented Model
PDF
Report_Altair
PDF
Concurrent Matrix Multiplication on Multi-core Processors
PDF
PDF
HIGH-LEVEL LANGUAGE EXTENSIONS FOR FAST EXECUTION OF PIPELINE-PARALLELIZED CO...
PDF
Parallelization of Graceful Labeling Using Open MP
PDF
A new reverse engineering approach to
PDF
Performance comparison of row per slave and rows set
PDF
Performance comparison of row per slave and rows set per slave method in pvm ...
PPTX
Apache Hadoop India Summit 2011 Keynote talk "Programming Abstractions for Sm...
PDF
Hb2412611268
PDF
Analysis of Impact of Graph Theory in Computer Application
PDF
Grid Computing A Research Monograph 1st Edition Janakiram
PDF
Automatic Compilation Of MATLAB Programs For Synergistic Execution On Heterog...
PDF
Performance Analysis of Parallel Algorithms on Multi-core System using OpenMP
PDF
Hypergraph for consensus optimization
PPT
Hardware and Software parallelism
PDF
BARRACUDA, AN OPEN SOURCE FRAMEWORK FOR PARALLELIZING DIVIDE AND CONQUER ALGO...
Raminder kaur presentation_two
GRAPH MATCHING ALGORITHM FOR TASK ASSIGNMENT PROBLEM
Data Parallel and Object Oriented Model
Report_Altair
Concurrent Matrix Multiplication on Multi-core Processors
HIGH-LEVEL LANGUAGE EXTENSIONS FOR FAST EXECUTION OF PIPELINE-PARALLELIZED CO...
Parallelization of Graceful Labeling Using Open MP
A new reverse engineering approach to
Performance comparison of row per slave and rows set
Performance comparison of row per slave and rows set per slave method in pvm ...
Apache Hadoop India Summit 2011 Keynote talk "Programming Abstractions for Sm...
Hb2412611268
Analysis of Impact of Graph Theory in Computer Application
Grid Computing A Research Monograph 1st Edition Janakiram
Automatic Compilation Of MATLAB Programs For Synergistic Execution On Heterog...
Performance Analysis of Parallel Algorithms on Multi-core System using OpenMP
Hypergraph for consensus optimization
Hardware and Software parallelism
BARRACUDA, AN OPEN SOURCE FRAMEWORK FOR PARALLELIZING DIVIDE AND CONQUER ALGO...

More from Amir Payberah (13)

PDF
P2P Content Distribution Network
PDF
Data Intensive Computing Frameworks
PDF
The Spark Big Data Analytics Platform
PDF
File System Implementation - Part2
PDF
File System Implementation - Part1
PDF
File System Interface
PDF
Virtual Memory - Part2
PDF
Virtual Memory - Part1
PDF
Main Memory - Part1
PDF
Deadlocks
PDF
CPU Scheduling - Part1
PDF
Process Synchronization - Part1
PDF
Introduction to Operating Systems - Part1
P2P Content Distribution Network
Data Intensive Computing Frameworks
The Spark Big Data Analytics Platform
File System Implementation - Part2
File System Implementation - Part1
File System Interface
Virtual Memory - Part2
Virtual Memory - Part1
Main Memory - Part1
Deadlocks
CPU Scheduling - Part1
Process Synchronization - Part1
Introduction to Operating Systems - Part1

Recently uploaded (20)

PDF
Review of recent advances in non-invasive hemoglobin estimation
PPTX
20250228 LYD VKU AI Blended-Learning.pptx
PPTX
MYSQL Presentation for SQL database connectivity
PDF
Encapsulation theory and applications.pdf
PPTX
ACSFv1EN-58255 AWS Academy Cloud Security Foundations.pptx
PDF
Architecting across the Boundaries of two Complex Domains - Healthcare & Tech...
PPTX
KOM of Painting work and Equipment Insulation REV00 update 25-dec.pptx
PPTX
Cloud computing and distributed systems.
PDF
TokAI - TikTok AI Agent : The First AI Application That Analyzes 10,000+ Vira...
PDF
Blue Purple Modern Animated Computer Science Presentation.pdf.pdf
PDF
Empathic Computing: Creating Shared Understanding
PDF
Approach and Philosophy of On baking technology
PDF
Encapsulation_ Review paper, used for researhc scholars
PDF
Chapter 3 Spatial Domain Image Processing.pdf
PPTX
Effective Security Operations Center (SOC) A Modern, Strategic, and Threat-In...
PPTX
Digital-Transformation-Roadmap-for-Companies.pptx
PDF
Per capita expenditure prediction using model stacking based on satellite ima...
PPTX
Programs and apps: productivity, graphics, security and other tools
PDF
The Rise and Fall of 3GPP – Time for a Sabbatical?
PDF
Advanced methodologies resolving dimensionality complications for autism neur...
Review of recent advances in non-invasive hemoglobin estimation
20250228 LYD VKU AI Blended-Learning.pptx
MYSQL Presentation for SQL database connectivity
Encapsulation theory and applications.pdf
ACSFv1EN-58255 AWS Academy Cloud Security Foundations.pptx
Architecting across the Boundaries of two Complex Domains - Healthcare & Tech...
KOM of Painting work and Equipment Insulation REV00 update 25-dec.pptx
Cloud computing and distributed systems.
TokAI - TikTok AI Agent : The First AI Application That Analyzes 10,000+ Vira...
Blue Purple Modern Animated Computer Science Presentation.pdf.pdf
Empathic Computing: Creating Shared Understanding
Approach and Philosophy of On baking technology
Encapsulation_ Review paper, used for researhc scholars
Chapter 3 Spatial Domain Image Processing.pdf
Effective Security Operations Center (SOC) A Modern, Strategic, and Threat-In...
Digital-Transformation-Roadmap-for-Companies.pptx
Per capita expenditure prediction using model stacking based on satellite ima...
Programs and apps: productivity, graphics, security and other tools
The Rise and Fall of 3GPP – Time for a Sabbatical?
Advanced methodologies resolving dimensionality complications for autism neur...

Graph processing - Graphlab

  • 1. GraphLab: A New Framework For Parallel Machine Learning Amir H. Payberah amir@sics.se Amirkabir University of Technology (Tehran Polytechnic) Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 1 / 42
  • 2. Reminder Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 2 / 42
  • 3. Data-Parallel Model for Large-Scale Graph Processing The platforms that have worked well for developing parallel applica- tions are not necessarily effective for large-scale graph problems. Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 3 / 42
  • 4. Graph-Parallel Processing Restricts the types of computation. New techniques to partition and distribute graphs. Exploit graph structure. Executes graph algorithms orders-of-magnitude faster than more general data-parallel systems. Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 4 / 42
  • 5. Data-Parallel vs. Graph-Parallel Computation Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 5 / 42
  • 6. Pregel Vertex-centric Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 6 / 42
  • 7. Pregel Vertex-centric Bulk Synchronous Parallel (BSP) Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 6 / 42
  • 8. Pregel Vertex-centric Bulk Synchronous Parallel (BSP) Runs in sequence of iterations (supersteps) Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 6 / 42
  • 9. Pregel Vertex-centric Bulk Synchronous Parallel (BSP) Runs in sequence of iterations (supersteps) A vertex in superstep S can: • reads messages sent to it in superstep S-1. • sends messages to other vertices: receiving at superstep S+1. • modifies its state. Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 6 / 42
  • 10. Pregel Limitations Inefficient if different regions of the graph converge at different speed. Can suffer if one task is more expensive than the others. Runtime of each phase is determined by the slowest machine. Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 7 / 42
  • 11. Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 8 / 42
  • 12. Data Model A directed graph that stores the program state, called data graph. Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 9 / 42
  • 13. Vertex Scope The scope of vertex v is the data stored in vertex v, in all adjacent vertices and adjacent edges. Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 10 / 42
  • 14. Programming Model (1/3) Rather than adopting a message passing as in Pregel, GraphLab allows the user defined function of a vertex to read and modify any of the data in its scope. Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 11 / 42
  • 15. Programming Model (2/3) Update function: user-defined function similar to Compute in Pregel. Can read and modify the data within the scope of a vertex. Schedules the future execution of other update functions. Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 12 / 42
  • 16. Programming Model (3/3) Sync function: similar to aggregate in Pregel. Maintains global aggregates. Performs periodically in the background. Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 13 / 42
  • 17. Execution Model Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 14 / 42
  • 18. Execution Model Each task in the set of tasks T , is a tuple (f, v) consisting of an update function f and a vertex v. Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 14 / 42
  • 19. Execution Model Each task in the set of tasks T , is a tuple (f, v) consisting of an update function f and a vertex v. After executing an update function (f, g, · · ·) the modified scope data in Sv is written back to the data graph. Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 14 / 42
  • 20. Example: PageRank GraphLab_PageRank(i) // compute sum over neighbors total = 0 foreach(j in in_neighbors(i)): total = total + R[j] * wji // update the PageRank R[i] = 0.15 + total // trigger neighbors to run again foreach(j in out_neighbors(i)): signal vertex-program on j R[i] = 0.15 + j∈Nbrs(i) wjiR[j] Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 15 / 42
  • 21. Data Consistency (1/3) Overlapped scopes: race-condition in simultaneous execution of two update functions. Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 16 / 42
  • 22. Data Consistency (1/3) Overlapped scopes: race-condition in simultaneous execution of two update functions. Full consistency: during the execution f(v), no other function reads or modifies data within the v scope. Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 16 / 42
  • 23. Data Consistency (2/3) Edge consistency: during the execution f(v), no other function reads or modifies any of the data on v or any of the edges adja- cent to v. Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 17 / 42
  • 24. Data Consistency (3/3) Vertex consistency: during the execution f(v), no other function will be applied to v. Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 18 / 42
  • 25. Sequential Consistency (1/2) Proving the correctness of a parallel algorithm: sequential consistency Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 19 / 42
  • 26. Sequential Consistency (1/2) Proving the correctness of a parallel algorithm: sequential consistency Sequential consistency: if for every parallel execution, there exists a sequential execution of update functions that produces an equivalent result. Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 19 / 42
  • 27. Sequential Consistency (2/2) A simple method to achieve serializability is to ensure that the scopes of concurrently executing update functions do not overlap. Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 20 / 42
  • 28. Sequential Consistency (2/2) A simple method to achieve serializability is to ensure that the scopes of concurrently executing update functions do not overlap. • The full consistency model is used. Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 20 / 42
  • 29. Sequential Consistency (2/2) A simple method to achieve serializability is to ensure that the scopes of concurrently executing update functions do not overlap. • The full consistency model is used. • The edge consistency model is used and update functions do not modify data in adjacent vertices. Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 20 / 42
  • 30. Sequential Consistency (2/2) A simple method to achieve serializability is to ensure that the scopes of concurrently executing update functions do not overlap. • The full consistency model is used. • The edge consistency model is used and update functions do not modify data in adjacent vertices. • The vertex consistency model is used and update functions only access local vertex data. Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 20 / 42
  • 31. Consistency vs. Parallelism Consistency vs. Parallelism [Low, Y., GraphLab: A Distributed Abstraction for Large Scale Machine Learning (Doctoral dissertation, University of California), 2013.] Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 21 / 42
  • 32. GraphLab Implementation Shared memory implementation Distributed implementation Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 22 / 42
  • 33. GraphLab Implementation Shared memory implementation Distributed implementation Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 22 / 42
  • 34. Tasks Schedulers (1/2) In what order should the tasks (vertex-update function pairs) be called? Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 23 / 42
  • 35. Tasks Schedulers (1/2) In what order should the tasks (vertex-update function pairs) be called? • A collection of base schedules, e.g., round-robin, and synchronous. • Set scheduler: enables users to compose custom update schedules. Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 23 / 42
  • 36. Tasks Schedulers (2/2) How to add new task in the queue? Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 24 / 42
  • 37. Tasks Schedulers (2/2) How to add new task in the queue? • FIFO: only permits task creation but do not permit task reordering. • Prioritized: permits task reordering at the cost of increased overhead. Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 24 / 42
  • 38. Consistency Implemented in C++ using PThreads for parallelism. Consistency: read-write lock Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 25 / 42
  • 39. Consistency Implemented in C++ using PThreads for parallelism. Consistency: read-write lock Vertex consistency • Central vertex (write-lock) Edge consistency • Central vertex (write-lock) • Adjacent vertices (read-locks) Full consistency • Central vertex (write-locks) • Adjacent vertices (write-locks) Deadlocks are avoided by acquiring locks sequentially following a canonical order. Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 25 / 42
  • 40. GraphLab Implementation Shared memory implementation Distributed implementation Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 26 / 42
  • 41. Distributed Implementation Graph partitioning • How to efficiently load, partition and distribute the data graph across machines? Consistency • How to achieve consistency in the distributed setting? Fault tolerance Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 27 / 42
  • 42. Graph Partitioning Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 28 / 42
  • 43. Graph Partitioning - Phase 1 (1/2) Two-phase partitioning. Partitioning the data graph into k parts, called atom. • k number of machines Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 29 / 42
  • 44. Graph Partitioning - Phase 1 (1/2) Two-phase partitioning. Partitioning the data graph into k parts, called atom. • k number of machines meta-graph: the graph of atoms (one vertex for each atom). Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 29 / 42
  • 45. Graph Partitioning - Phase 1 (1/2) Two-phase partitioning. Partitioning the data graph into k parts, called atom. • k number of machines meta-graph: the graph of atoms (one vertex for each atom). Atom weight: the amount of data it stores. Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 29 / 42
  • 46. Graph Partitioning - Phase 1 (1/2) Two-phase partitioning. Partitioning the data graph into k parts, called atom. • k number of machines meta-graph: the graph of atoms (one vertex for each atom). Atom weight: the amount of data it stores. Edge weight: the number of edges crossing the atoms. Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 29 / 42
  • 47. Graph Partitioning - Phase 1 (2/2) Each atom is stored as a separate file on a distributed storage system, e.g., HDFS. Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 30 / 42
  • 48. Graph Partitioning - Phase 1 (2/2) Each atom is stored as a separate file on a distributed storage system, e.g., HDFS. Each atom file is a simple binary that stores interior and the ghosts of the partition information. Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 30 / 42
  • 49. Graph Partitioning - Phase 1 (2/2) Each atom is stored as a separate file on a distributed storage system, e.g., HDFS. Each atom file is a simple binary that stores interior and the ghosts of the partition information. Ghost: set of vertices and edges adjacent to the partition boundary. Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 30 / 42
  • 50. Graph Partitioning - Phase 2 Meta-graph is very small. A fast balanced partition of the meta-graph over the physical ma- chines. Assigning graph atoms to machines. Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 31 / 42
  • 51. Consistency Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 32 / 42
  • 52. Consistency To achieve a serializable parallel execution of a set of dependent tasks. • Chromatic engine • Distributed locking engine Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 33 / 42
  • 53. Consistency - Chromatic Engine Construct a vertex coloring: assigns a color to each vertex such that no adjacent vertices share the same color. Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 34 / 42
  • 54. Consistency - Chromatic Engine Construct a vertex coloring: assigns a color to each vertex such that no adjacent vertices share the same color. Edge consistency: executing, synchronously, all update tasks asso- ciated with vertices of the same color before proceeding to the next color. Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 34 / 42
  • 55. Consistency - Chromatic Engine Construct a vertex coloring: assigns a color to each vertex such that no adjacent vertices share the same color. Edge consistency: executing, synchronously, all update tasks asso- ciated with vertices of the same color before proceeding to the next color. Full consistency: no vertex shares the same color as any of its dis- tance two neighbors. Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 34 / 42
  • 56. Consistency - Chromatic Engine Construct a vertex coloring: assigns a color to each vertex such that no adjacent vertices share the same color. Edge consistency: executing, synchronously, all update tasks asso- ciated with vertices of the same color before proceeding to the next color. Full consistency: no vertex shares the same color as any of its dis- tance two neighbors. Vertex consistency: assigning all vertices the same color. Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 34 / 42
  • 57. Consistency - Distributed Locking Engine Associating a readers-writer lock with each vertex. Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 35 / 42
  • 58. Consistency - Distributed Locking Engine Associating a readers-writer lock with each vertex. Vertex consistency • Central vertex (write-lock) Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 35 / 42
  • 59. Consistency - Distributed Locking Engine Associating a readers-writer lock with each vertex. Vertex consistency • Central vertex (write-lock) Edge consistency • Central vertex (write-lock), Adjacent vertices (read-locks) Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 35 / 42
  • 60. Consistency - Distributed Locking Engine Associating a readers-writer lock with each vertex. Vertex consistency • Central vertex (write-lock) Edge consistency • Central vertex (write-lock), Adjacent vertices (read-locks) Full consistency • Central vertex (write-locks), Adjacent vertices (write-locks) Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 35 / 42
  • 61. Consistency - Distributed Locking Engine Associating a readers-writer lock with each vertex. Vertex consistency • Central vertex (write-lock) Edge consistency • Central vertex (write-lock), Adjacent vertices (read-locks) Full consistency • Central vertex (write-locks), Adjacent vertices (write-locks) Deadlocks are avoided by acquiring locks sequentially following a canonical order. Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 35 / 42
  • 62. Fault Tolerance Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 36 / 42
  • 63. Fault Tolerance - Synchronous The systems periodically signals all computation activity to halt. Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 37 / 42
  • 64. Fault Tolerance - Synchronous The systems periodically signals all computation activity to halt. Then synchronizes all caches (ghosts) and saves to disk all data which has been modified since the last snapshot. Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 37 / 42
  • 65. Fault Tolerance - Synchronous The systems periodically signals all computation activity to halt. Then synchronizes all caches (ghosts) and saves to disk all data which has been modified since the last snapshot. Simple, but eliminates the systems advantage of asynchronous com- putation. Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 37 / 42
  • 66. Fault Tolerance - Asynchronous Based on the Chandy-Lamport algorithm. Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 38 / 42
  • 67. Fault Tolerance - Asynchronous Based on the Chandy-Lamport algorithm. The snapshot function is implemented as an update function in vertices. Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 38 / 42
  • 68. Fault Tolerance - Asynchronous Based on the Chandy-Lamport algorithm. The snapshot function is implemented as an update function in vertices. The snapshot update takes priority over all other update functions. Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 38 / 42
  • 69. Fault Tolerance - Asynchronous Based on the Chandy-Lamport algorithm. The snapshot function is implemented as an update function in vertices. The snapshot update takes priority over all other update functions. Edge Consistency is used on all update functions. Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 38 / 42
  • 70. Fault Tolerance - Asynchronous Based on the Chandy-Lamport algorithm. The snapshot function is implemented as an update function in vertices. The snapshot update takes priority over all other update functions. Edge Consistency is used on all update functions. Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 38 / 42
  • 71. Summary Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 39 / 42
  • 72. GraphLab Summary Asynchronous model Vertex-centric Communication: distributed shared memory Three consistency levels: full, edge-level, and vertex-level Partitioning: two-phase partitioning Consistency: chromatic engine (graph coloring), distributed lock engine (reader-writer lock) Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 40 / 42
  • 73. GraphLab Limitations Poor performance on Natural graphs. Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 41 / 42
  • 74. Questions? Amir H. Payberah (Tehran Polytechnic) GraphLab 1393/9/8 42 / 42