Hot Topics in Information Management
PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs
Igor Shevchenko
Mentor: Sebastian Schelter
Fachgebiet Datenbanksysteme und Informationsmanagement (DIMA)
Technische Universität Berlin
http://www.dima.tu-berlin.de/
Agenda
1. Natural Graphs: Properties and Problems;
2. PowerGraph: Vertex Cut and Vertex Programs;
3. GAS Decomposition;
4. Vertex Cut Partitioning;
5. Delta Caching;
6. Applications and Evaluation;
Paper:
Gonzalez et al. PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs.
Natural Graphs
■ Natural graphs are graphs derived from real-world or natural phenomena;
■ Natural graphs are big: billions of vertices and edges, plus rich metadata;
Natural graphs have a Power-Law Degree Distribution.
Power-Law Degree Distribution
(Andrei Broder et al. Graph structure in the web)
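For reference, the power-law form behind this plot, as used in the PowerGraph paper (α is the shape parameter; smaller α means a heavier tail):

```latex
% Power-law degree distribution: the probability that a vertex has
% degree d decays polynomially in d, with shape parameter alpha.
\[
  \mathbf{P}(\deg v = d) \;\propto\; d^{-\alpha}
\]
% Broder et al. report alpha close to 2 for the web graph: most
% vertices have low degree, while a few have extremely high degree.
```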
Goal
■ We want to analyze natural graphs;
■ Essential for Data Mining and Machine Learning:
 Identify influential people and information;
 Identify special nodes and communities;
 Model complex data dependencies;
 Target ads and products;
 Find communities;
 Flow scheduling;
Problem
■ Existing distributed graph computation systems perform poorly on natural graphs (Gonzalez et al. OSDI '12);
■ The reason is the presence of high-degree vertices;
(Figure: high-degree vertices form a star-like motif)
Problem Continued
Possible problems with high-degree vertices:
■ Limited single-machine resources;
■ Work imbalance;
■ Sequential computation;
■ Communication costs;
■ Graph partitioning;
Applicable to:
■ Hadoop; GraphLab; Pregel (Piccolo);
Problem: Limited Single-Machine Resources
■ High-degree vertices can exceed the memory capacity of a single machine;
■ Their edge metadata and adjacency information must still be stored somewhere;
Problem: Work Imbalance
■ The power-law degree distribution can lead to significant work imbalance and frequent barriers;
■ For example, with synchronous execution (Pregel), the runtime of each superstep is determined by the slowest, i.e. highest-degree, vertex;
Problem: Sequential Computation
■ No parallelization of individual vertex-programs;
■ Edges are processed sequentially;
■ Locking does not scale well to high-degree vertices (for ex. in GraphLab);
(Figure captions: "Sequentially process edges"; "Asynchronous execution requires heavy locking")
Problem: Communication Costs
■ High-degree vertices generate and send a large number of identical messages (for ex. in Pregel);
■ This results in communication asymmetry;
Problem: Graph Partitioning
■ Natural graphs are difficult to partition;
■ Pregel and GraphLab use random (hashed) partitioning on natural graphs, thus maximizing network communication;
Problem: Graph Partitioning Continued
■ Natural graphs are difficult to partition;
■ Pregel and GraphLab use random (hashed) partitioning on natural graphs, thus maximizing network communication;
Expected fraction of edges that are cut, where p = number of machines (derivation below):
E[ |edges cut| / |E| ] = 1 − 1/p
Examples:
■ 10 machines: 90% of edges cut;
■ 100 machines: 99% of edges cut;
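The percentages follow from a one-line argument, assuming each vertex is hashed uniformly and independently to one of p machines:

```latex
% An edge (u, v) is NOT cut only if both endpoints hash to the same
% machine, which happens with probability 1/p. Hence:
\[
  \mathbf{E}\!\left[\frac{|\text{edges cut}|}{|E|}\right] = 1 - \frac{1}{p},
  \qquad p = 10 \Rightarrow 90\%, \quad p = 100 \Rightarrow 99\%.
\]
```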
In Summary
■ GraphLab and Pregel are not well suited for computations on natural graphs;
Reasons:
■ Challenges of high-degree vertices;
■ Low-quality partitioning;
Solution:
■ PowerGraph: a new abstraction;
PowerGraph
Partition Techniques
Two approaches for partitioning the graph in a distributed environment:
■ Edge Cut;
■ Vertex Cut;
Edge Cut
■ Used by the Pregel and GraphLab abstractions;
■ Evenly assign vertices to machines;
Vertex Cut (the strong point of the paper)
■ Used by the PowerGraph abstraction;
■ Evenly assign edges to machines;
(Figure: 4 edges on each machine)
Vertex Programs
"Think like a Vertex" [Malewicz et al. SIGMOD'10]
User-defined Vertex-Program:
1. Runs on each vertex;
2. Interactions are constrained by the graph structure;
Pregel and GraphLab also use this concept; parallelism is achieved by running multiple vertex-programs simultaneously;
GAS Decomposition (the strong point of the paper)
■ Vertex cut distributes a single vertex-program across several machines;
■ This makes it possible to parallelize high-degree vertices;
GAS Decomposition (the strong point of the paper)
Generalize the vertex-program into three phases:
1. Gather
 Accumulate information about the neighborhood;
2. Apply
 Apply the accumulated value to the center vertex;
3. Scatter
 Update adjacent edges and vertices;
Gather, Apply and Scatter are user-defined functions (a PageRank sketch follows below);
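To make the three phases concrete, here is a minimal, single-machine sketch of PageRank written against a hypothetical GAS interface. The function names and the driver loop are illustrative, not PowerGraph's actual (C++) API; they only show how the three phases split the work.

```python
# Minimal single-machine sketch of PageRank as a GAS vertex-program.
graph = {1: [2, 3], 2: [3], 3: [1]}          # vertex -> out-neighbors
in_nbrs = {v: [u for u in graph if v in graph[u]] for v in graph}
ranks = {v: 1.0 for v in graph}

def gather(u, v):
    # Gather: runs on each in-edge (u -> v); neighbor u's contribution.
    return ranks[u] / len(graph[u])

def gas_sum(a, b):
    # Commutative, associative combiner for gather results.
    return a + b

def apply(v, acc):
    # Apply: runs once on the center vertex with the accumulated value.
    ranks[v] = 0.15 + 0.85 * acc

def scatter(v, active):
    # Scatter: runs on out-edges; reschedule neighbors (simplified:
    # always reactivate; a real engine would test for change first).
    active.update(graph[v])

for _ in range(20):                           # fixed-point iterations
    active = set()                            # would drive scheduling
    for v in graph:
        acc = 0.0
        for u in in_nbrs[v]:
            acc = gas_sum(acc, gather(u, v))
        apply(v, acc)
        scatter(v, active)

print(ranks)
```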
Gather Phase
■ Executed on the edges in parallel;
■ Accumulates information about the neighborhood;
Apply Phase
■ Executed on the central vertex;
■ Applies the accumulated value to the center vertex;
Scatter Phase
■ Executed on the neighboring vertices in parallel;
■ Updates adjacent edges and vertices;
GAS Decomposition
■ Vertex-programs written using the GAS decomposition automatically scale to several machines. How does it work?
GAS in a Distributed Environment
GAS in a Distributed Environment
■ Case with 2 machines;
Gather Phase
■ Compute partial sums on each machine;
Gather Phase
■ Send the partial sums to the master machine;
■ The master machine computes the total sum;
Apply Phase
■ Apply the accumulated value to the center vertex;
■ Replicate the value to the mirrors;
Scatter Phase
■ Update adjacent edges and vertices;
■ Initiate neighboring vertex-programs if necessary;
SUM Operation
■ During the Gather Phase, partial results are combined using a commutative and associative user-defined SUM operation;
■ Examples (a two-machine illustration follows below):
sum(a, b): return a + b
sum(a, b): return union(a, b)
sum(a, b): return min(a, b)
■ The same requirement holds for Pregel combiners;
■ What if the SUM is not commutative and associative?
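Commutativity and associativity are exactly what let each machine pre-aggregate its local edges before talking to the master. A small sketch of that two-level reduction (the contribution values and the machine split are made up for illustration):

```python
from functools import reduce

# Gather contributions for one vertex, split across two machines
# (the split itself is illustrative).
machine_a = [3, 1, 4]
machine_b = [1, 5]

def gas_sum(a, b):
    # Any commutative, associative SUM works here: +, min, set union, ...
    return a + b

# Each machine reduces its local edges first; only the two partial
# sums cross the network, and the master combines them.
partial_a = reduce(gas_sum, machine_a)
partial_b = reduce(gas_sum, machine_b)
total = gas_sum(partial_a, partial_b)

# Same result as one sequential reduction over all edges:
assert total == reduce(gas_sum, machine_a + machine_b)
print(total)  # 14
```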
Gather Phase: No Partial Sums
■ If the SUM is not commutative and associative:
■ Each edge's data must be sent to the master machine;
■ This increases the amount of Gather-phase communication (edge data per edge instead of one partial sum per machine);
Vertex Cut Partitioning (the strong point of the paper)
Vertex Cut Partitioning
Three distributed approaches for Vertex Cut:
■ Random Edge Placement;
■ Coordinated Greedy Edge Placement;
■ Oblivious Greedy Edge Placement;
Minimizing the number of machines spanned by each vertex = minimizing the communication and storage overhead (formalized below);
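Written as an optimization problem (my paraphrase of the balanced vertex-cut objective in the paper; A(v) is the set of machines spanned by vertex v, A(e) the machine holding edge e, p the number of machines, and λ ≥ 1 an imbalance factor):

```latex
% Balanced p-way vertex cut: minimize the average replication
% factor while keeping the edge load on every machine bounded.
\[
  \min_{A} \;\; \frac{1}{|V|} \sum_{v \in V} |A(v)|
  \quad \text{s.t.} \quad
  \max_{m} \, \bigl|\{ e \in E : A(e) = m \}\bigr| \;<\; \lambda \, \frac{|E|}{p}
\]
```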
Random Edge Placement
■ Randomly assign edges to machines;
■ Edge data is uniquely assigned to one machine;
Communication Overhead
■ In the example, only 3 network communication channels are used;
■ Network communication usage can be predicted (see the formula below);
■ Significantly less communication compared to Edge Cut graph placement;
■ And we can improve upon random placement!
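The predictability comes from a closed form for random placement. To the best of my recollection of the paper's analysis (so treat the exact form with care), the expected replication factor depends only on p and the vertex degrees d(v):

```latex
% Expected replication factor of a random vertex cut on p machines:
% each of v's d(v) edges independently lands on one of p machines,
% so v spans a machine unless none of its edges hash there.
\[
  \mathbf{E}\!\left[\frac{1}{|V|} \sum_{v \in V} |A(v)|\right]
  \;=\; \frac{p}{|V|} \sum_{v \in V}
        \left( 1 - \left(1 - \frac{1}{p}\right)^{d(v)} \right)
\]
```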
Greedy Edge Placement
■ Place an edge on a machine that already holds one of its endpoint vertices;
Greedy Edge Placement
■ If several choices are possible, assign the edge to the least loaded machine;
Greedy Edge Placement
■ Greedy Edge Placement is a de-randomization of random placement (a sketch of the placement rule follows below);
■ It minimizes the number of machines spanned by each vertex;
Coordinated Greedy Edge Placement:
■ Requires coordination to place each edge;
■ Maintains a global distributed placement table;
■ Slower, but produces higher-quality cuts;
Oblivious Greedy Edge Placement:
■ Approximates the greedy objective without coordination;
■ Faster, but produces lower-quality cuts;
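A compact sketch of the greedy rule as I read it from the paper; the tie-breaking details are simplified (the paper additionally prefers the endpoint with more unassigned edges when the span sets are disjoint), and the data structures are illustrative:

```python
# Simplified sketch of greedy edge placement. A[v] is the set of
# machines vertex v already spans; load[m] counts edges per machine.
from collections import defaultdict

def least_loaded(candidates, load):
    return min(candidates, key=lambda m: load[m])

def place_edge(u, v, A, load, machines):
    both = A[u] & A[v]
    if both:                      # 1. some machine holds both endpoints
        m = least_loaded(both, load)
    elif A[u] or A[v]:            # 2./3. at least one endpoint placed
        m = least_loaded(A[u] | A[v], load)
    else:                         # 4. neither endpoint placed yet
        m = least_loaded(machines, load)
    A[u].add(m)
    A[v].add(m)
    load[m] += 1
    return m

machines = [0, 1, 2]
A = defaultdict(set)
load = defaultdict(int)
for u, v in [(1, 2), (2, 3), (1, 3), (3, 4)]:
    place_edge(u, v, A, load, machines)

# Replication factor = average number of machines spanned per vertex.
print(sum(len(s) for s in A.values()) / len(A))
```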
Vertex Cut Partitioning: Comparison
■ Twitter Follower Graph: 41M vertices, 1.4B edges;
■ Oblivious Greedy Edge Placement balances cost (replication factor) and construction time;
Vertex Cut Partitioning: Comparison
■ Greedy Edge Placement improves computation performance;
Delta Caching
Execution Modes
Delta Caching (the strong point of the paper)
■ A vertex-program can be triggered in response to a change in only a few of its neighbors;
■ In response, the Gather Phase will nevertheless accumulate information about the entire neighborhood;
Delta Caching (the strong point of the paper)
■ Accelerate the process by caching the neighborhood accumulator from the previous Gather Phase;
Delta Caching (the strong point of the paper)
Delta Caching can speed up:
■ the Gather Phase;
■ the Scatter Phase;
Requires an Abelian group: a commutative, associative
■ sum (+) with an
■ inverse (−);
Examples:
■ PageRank – applicable (see the sketch below);
■ Graph Coloring – not applicable;
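A minimal sketch of the idea for a PageRank-style additive accumulator, assuming the sum has an inverse; the caching structure is illustrative, not PowerGraph's implementation. Instead of re-gathering the whole neighborhood, scatter forwards only the changed neighbor's delta, and the cached accumulator is patched:

```python
# Sketch of delta caching for an additive (Abelian) accumulator.
# cache[v] holds the accumulated gather result from the last round.

cache = {"v": 0.75}          # accumulator gathered in an earlier round

def old_contribution(u):
    return 0.25              # u's contribution when the cache was built

def new_contribution(u):
    return 0.10              # u's contribution after u's rank changed

def scatter_delta(u, v):
    # Send only the change, not the whole neighborhood.
    delta = new_contribution(u) - old_contribution(u)
    cache[v] = cache[v] + delta      # patch instead of full re-gather

scatter_delta("u", "v")
print(cache["v"])            # ~0.60: same as re-gathering from scratch
```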
Execution Modes
Supports three execution modes:
■ Synchronous: Bulk-Synchronous GAS Phases;
■ Asynchronous: Interleave GAS Phases;
■ Asynchronous Serializable: Prevents neighboring vertices from running simultaneously;
Different tradeoffs:
■ Algorithm performance;
■ System performance;
■ Determinism;
Evaluation
Evaluation
PowerGraph on natural graphs shows, on many examples:
■ Reduced network communication;
■ Reduced runtime;
■ Reduced storage;
(Figure: PageRank on the Twitter Follower Graph, 40M users, 1.4 billion links)
Applicability
■ Collaborative Filtering
 Alternating Least Squares
 Stochastic Gradient Descent
 SVD
 Non-negative MF
■ Statistical Inference
 Loopy Belief Propagation
 Max-Product Linear Programs
 Gibbs Sampling
■ Graph Analytics
 PageRank
 Triangle Counting
 Shortest Path
 Graph Coloring
 K-core Decomposition
■ Computer Vision
 Image stitching
■ Language Modeling
 LDA
Strong Points of the Paper
■ Vertex Cut;
■ GAS Decomposition;
■ Delta Caching;
■ Three modes of execution:
 Synchronous;
 Asynchronous;
 Asynchronous + Serializable;
Weak Points of the Paper
■ "In all cases the system is entirely symmetric with no single coordinating instance or scheduler";
How do they deal with Synchronous execution, which needs a global barrier?
Messy evaluation:
■ Synchronous execution is evaluated using PageRank;
■ Asynchronous execution is evaluated using Graph Coloring;
■ Asynchronous Serializable execution is evaluated using Graph Coloring;
■ The comparison with published results again uses PageRank and Triangle Counting, but not Graph Coloring;
■ Oblivious Greedy Edge Placement is poorly explained;
References
■ Joseph E. Gonzalez, Yucheng Low, Haijie Gu, Danny Bickson, Carlos Guestrin. PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs. In 10th USENIX Symposium on Operating Systems Design and Implementation (OSDI 2012).
■ Malewicz, G., Austern, M. H., Bik, A. J., Dehnert, J., Horn, I., Leiser, N., and Czajkowski, G. Pregel: A System for Large-Scale Graph Processing. In SIGMOD (2010).
■ Low, Y., Gonzalez, J., Kyrola, A., Bickson, D., Guestrin, C., and Hellerstein, J. M. Distributed GraphLab: A Framework for Machine Learning and Data Mining in the Cloud. In PVLDB (2012).
■ http://graphlab.org
Questions?
1. Natural Graphs: Properties and Problems;
2. PowerGraph: Vertex Cut and Vertex Programs;
3. GAS Decomposition;
4. Vertex Cut Partitioning;
5. Delta Caching;
6. Applications and Evaluation;
Paper:
Gonzalez et al. PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs.