Graph Convolutional Networks
In Apache Spark
Intelligent Workflow Automati
Scalacon
Emiliano Martínez
November 2021
BBVA Innovation Labs
n
About Me:
Programming in Scala for 10 years
Akka development
Functional domain models with FP libraries Cats, Scalaz
Big Data: Spark, Kafka, Cassandra, NoSql, ...
Machine Learning: Spark ML, Sklearn, Analytics Zoo, Tensorflow, Torch
Currently, I do NLP at BBVA
Deep Learning Models
- A deep learning model is a machine learning model based on neural networks that tries to mimic the structure and function of the human brain.
- Supervised machine learning method.
- The goal is to approximate a function that maps an input x to a category by adjusting the values of the parameters Θ: y = f(x; Θ)
Deep Learning Models
- Automatic speech recognition. A wave generated from a human voice is broken down into units called phonemes. Each phoneme is like a chain link: by analyzing them in sequence, starting from the first phoneme, the ASR software uses statistical probability analysis to deduce whole words and, from there, complete sentences.
- Image recognition. Based on CNNs. Automatically identifies objects, people, and places in images. It is used in guiding robots, autonomous vehicles, driver assistance systems, etc.
- Drug discovery. Using graph convolutional networks.
- Natural language processing. A set of machine learning models and techniques to process natural language data.
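Neural Networks
[Figure: fully connected network with an input layer, hidden layers, and an output layer; nodes are labeled with activations $a_1^{l_1}, \dots, a_4^{l_1}$ and $a_1^{l_2}, \dots, a_4^{l_2}$; each layer carries W + b parameters.]
- $x \in \mathbb{R}^n$: input training samples with n features
- $y \in \mathbb{R}^k$: output label vector
- $\hat{y} \in \mathbb{R}^k$: prediction vector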
Layer Feed Forward Equations
$z^{l} = W^{l} a^{l-1} + b^{l}$
$a^{l} = \sigma(z^{l})$
Activation functions:
- Sigmoid: $g(z) = 1 / (1 + e^{-z})$
- ReLU: $g(z) = \max(0, z)$
- Tanh: $g(z) = (e^{z} - e^{-z}) / (e^{z} + e^{-z})$
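The feed-forward equations can be sketched in a few lines of plain Scala. This is only an illustration with dense arrays; the names `FeedForward`, `layer`, `Vec`, and `Mat` are assumptions made for this example and do not come from the talk or from any library.

```scala
object FeedForward {
  type Vec = Array[Double]
  type Mat = Array[Array[Double]]

  // The three activation functions listed on the slide.
  def sigmoid(z: Double): Double = 1.0 / (1.0 + math.exp(-z))
  def relu(z: Double): Double = math.max(0.0, z)
  def tanh(z: Double): Double = math.tanh(z)

  /** Computes a = g(W a_prev + b) for one layer, with g applied element-wise. */
  def layer(w: Mat, b: Vec, g: Double => Double)(aPrev: Vec): Vec =
    w.zip(b).map { case (row, bias) =>
      val z = row.zip(aPrev).map { case (wi, ai) => wi * ai }.sum + bias
      g(z)
    }
}
```

Stacking layers is then function composition, for example `layer(w2, b2, FeedForward.sigmoid)(layer(w1, b1, FeedForward.relu)(x))`.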
Training Equations
Forward step:
- Layer feed forward: $H_i = \sigma(W_i H_{i-1} + b_i)$
- Loss function: $CE(P, Q) = -\mathbb{E}_{x \sim P}[\log q(x)]$
Backward step:
- Gradient calculation
- Weights update
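As a toy illustration of the forward and backward steps, the sketch below computes the cross-entropy of a softmax output and applies one gradient-descent update. To keep it short it treats the logits themselves as the trainable parameters, which is an assumption of this example and not how a real network is trained; all names are hypothetical.

```scala
object Training {
  /** Numerically stable softmax. */
  def softmax(z: Array[Double]): Array[Double] = {
    val m = z.max
    val e = z.map(v => math.exp(v - m))
    val s = e.sum
    e.map(_ / s)
  }

  /** CE(P, Q) = -sum_x p(x) log q(x); terms with p(x) = 0 contribute nothing. */
  def crossEntropy(p: Array[Double], q: Array[Double]): Double = {
    val terms = p.zip(q).collect { case (pi, qi) if pi > 0 => pi * math.log(qi) }
    -terms.sum
  }

  /** Gradient of CE(p, softmax(z)) with respect to z, which is softmax(z) - p. */
  def gradLogits(p: Array[Double], z: Array[Double]): Array[Double] =
    softmax(z).zip(p).map { case (qi, pi) => qi - pi }

  /** One gradient-descent update: z <- z - lr * grad. */
  def step(z: Array[Double], grad: Array[Double], lr: Double): Array[Double] =
    z.zip(grad).map { case (zi, gi) => zi - lr * gi }
}
```

Repeating forward pass, loss, gradient, and update is the loop that the optimizer runs for every mini-batch.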
Graphs
- Graphs are described by a set of vertices and a set of edges: G = (V, E)
- They model data that cannot be represented in a Euclidean space.
- Input training samples can be represented as nodes of a graph. They are defined by their properties and by their connections with other nodes.
- Graphs can be cyclic, acyclic, weighted, ...
Graph Examples
[Figure: caffeine molecule. Image taken from Wikipedia.]
[Figure: social network graph. Image taken from Wikimedia.]
Graph Neural Networks
- A type of NN that operates directly on graph structures.
- It can be used for node classification tasks.
- Different approaches can be used depending on the case:
a. Inductive: GraphSage, ...
b. Transductive: Spectral graph convolutions, DeepWalk, ...
Convolutions
[Figure: three 3x3 grids from the slide]
0 1 1    1 0 1    1 -1 0
0 0 0    0 1 1    0 -1 0
1 1 1    1 0 0    1  1 1
Some CNN improvements:
- Filters are applied to detect details of the images.
- Fewer parameters than the fully connected layer model.
- Pixel positions and neighborhoods have semantic meaning.
- Element-wise multiplication between the filter-sized patch of the input and the filter, which is then summed, always resulting in a single value.
- Translation invariance.
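The "multiply the patch by the filter element-wise, then sum" operation can be written directly. The sketch below is a minimal valid (no padding, stride 1) 2D convolution in the deep learning sense, that is, without flipping the filter; the object and method names are made up for this example.

```scala
object Conv2D {
  type Mat = Array[Array[Double]]

  /** Slides `filter` over `input` and returns one summed value per position. */
  def convolve(input: Mat, filter: Mat): Mat = {
    val fh = filter.length
    val fw = filter(0).length
    val oh = input.length - fh + 1
    val ow = input(0).length - fw + 1
    Array.tabulate(oh, ow) { (r, c) =>
      // Element-wise product of the filter with the patch at (r, c), then sum.
      (for (i <- 0 until fh; j <- 0 until fw)
        yield input(r + i)(c + j) * filter(i)(j)).sum
    }
  }
}
```

Because the same small filter is reused at every position, the number of parameters does not grow with the image size, which is the saving over a fully connected layer mentioned above.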
Graph Convolutional Networks
[Figure: taxonomy diagrams]
- Models: spectral-based models; spatial-based models (classic CNN models, propagation models, general frameworks).
- Applications: computer vision (images, videos, point clouds, meshes); NLP; science (physics, chemistry, social).
- Spectral-based models: representation in the Fourier domain through the eigendecomposition of the graph Laplacian matrix.
GCN Intuition
[Figure: graph with five nodes, numbered 1 to 5]
N x N graph adjacency matrix A:
0 1 1 0 0
1 0 1 1 1
1 1 0 0 0
0 1 0 0 1
0 1 0 1 0
X = N x F (N nodes by F features per node)
$H_i = f(A H_{i-1})$; for the first layer, $H_1 = f(A X)$
Pipeline: X → Convolution 1, dot(A, X) → Dense → H1 → Convolution 2, dot(A, H1) → Dense → H2 → SoftMax
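To make dot(A, X) concrete, the sketch below multiplies the five-node adjacency matrix from the slide by a feature matrix. The single-column feature matrix `x` is invented for the example, and `dot` is a naive dense product written here for illustration only.

```scala
object GcnIntuition {
  type Mat = Array[Array[Double]]

  /** Naive dense matrix product a * b. */
  def dot(a: Mat, b: Mat): Mat =
    a.map(row => b.transpose.map(col => row.zip(col).map { case (x, y) => x * y }.sum))

  // Adjacency matrix of the five-node graph (nodes 1 to 5).
  val adjacency: Mat = Array(
    Array(0, 1, 1, 0, 0),
    Array(1, 0, 1, 1, 1),
    Array(1, 1, 0, 0, 0),
    Array(0, 1, 0, 0, 1),
    Array(0, 1, 0, 1, 0)
  ).map(_.map(_.toDouble))

  // Hypothetical features: one feature per node, equal to the node number.
  val x: Mat = Array(Array(1.0), Array(2.0), Array(3.0), Array(4.0), Array(5.0))

  // Each row of A * X is the sum of the features of that node's neighbors.
  val aggregated: Mat = dot(adjacency, x)
}
```

Node 2, for instance, is connected to nodes 1, 3, 4, and 5, so its aggregated feature is 1 + 3 + 4 + 5. A node's own feature is not included unless the identity matrix is added to A.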
Graph Analysis models
- Fast Spectral Graph Convolution. Kipf & Welling (2017)
$H^{(l+1)} = \sigma\left(D^{-1/2} \hat{A} D^{-1/2} H^{(l)} W^{(l)}\right)$
where $\hat{A}$ is the adjacency matrix plus the identity matrix and D is the degree matrix of $\hat{A}$.
● Semi-supervised learning method.
● Simplification of spectral graph analysis.
● Two-layer GCN.
● Two-hop neighborhood.
Laplacian Eigendecomposition
- Spectral graph convolution (Defferrard et al., 2016).
- Truncated expansion in terms of Chebyshev polynomials (Defferrard et al., 2016).
- First-order approximation of spectral graph convolutions (Kipf & Welling, 2016), with K = 1, $\theta_0 = 2$, and $\theta_1 = -1$.
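The equations on this slide did not survive extraction. For reference, the chain of approximations as it is usually written in Kipf & Welling's paper is sketched below; the notation (primed coefficients, $\tilde{L}$, $\lambda_{\max} \approx 2$) follows that paper and may differ from the symbols used on the slide.

```latex
% Spectral convolution: filter the signal x in the eigenbasis of the Laplacian
L = I_N - D^{-1/2} A D^{-1/2} = U \Lambda U^{\top}, \qquad
g_\theta \star x = U \, g_\theta(\Lambda) \, U^{\top} x

% Truncated Chebyshev expansion of order K
g_{\theta'} \star x \approx \sum_{k=0}^{K} \theta'_k \, T_k(\tilde{L}) \, x, \qquad
\tilde{L} = \frac{2}{\lambda_{\max}} L - I_N

% First-order approximation: K = 1 and \lambda_{\max} \approx 2
g_{\theta'} \star x \approx \theta'_0 \, x - \theta'_1 \, D^{-1/2} A D^{-1/2} x

% Single parameter \theta = \theta'_0 = -\theta'_1
g_\theta \star x \approx \theta \left( I_N + D^{-1/2} A D^{-1/2} \right) x
```

Replacing $I_N + D^{-1/2} A D^{-1/2}$ with the normalized matrix built from the adjacency matrix plus the identity gives the layer-wise propagation rule of the GCN.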
Frameworks used for implementation
- Apache Spark 3.1.2
1. GraphX to get the connected components.
2. Spark ML to transform the DataFrames corresponding to graphs.
3. Spark Core to create the RDD[Sample] and to partition the dataset depending on the graph's components.
- Breeze 1.0
1. Create the adjacency matrix that represents node connections as a sparse matrix.
2. Convert it to a symmetric matrix.
3. Normalize the matrix according to the spectral graph approximation.
- Analytics Zoo for Spark 3.1.2
1. Build the model graph.
2. Model optimization.
Deep Learning in Spark
- Analytics Zoo
It provides a distributed deep learning framework using a Scala, Keras-based implementation that runs on the BigDL framework.
BigDL: A Distributed Deep Learning Framework for Big Data
https://arxiv.org/pdf/1804.05839.pdf
Experiment Steps
[Figure: pipeline diagram]
1. Read files: edges and nodes.
2. Build the input tensor dataset and the adjacency matrix (sparse Breeze matrix converted to an Analytics Zoo sparse Tensor).
3. Split the graph: an RDD Partitioner places Graph 1 in Partition 1 and Graph 2 in Partition 2.
4. Get Spark workers: each worker trains a copy of the model (two modules of one convolutional and one hidden layer; Adam optimizer and L2 regularization).
5. All-reduce parameters.
Case implementation
- Cora dataset
The Cora dataset consists of 2708 scientific publications, each classified into one of seven classes. The citation network consists of 5429 links. Each publication in the dataset is described by a 0/1-valued word vector indicating the absence/presence of the corresponding word from the dictionary. The dictionary consists of 1433 unique words.
Documents are classified into one of these seven classes: Neural_Networks, Rule_Learning, Reinforcement_Learning, Probabilistic_Methods, Theory, Genetic_Algorithms, Case_Based
Two main files: content, with the input features, and edges. Data is loaded into an RDD with one partition per graph.
Cora.cites:
35 1033
35 103482
35 103515
35 1050679
35 1103960
887 334153
906 910
Cora.content:
31336 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 … 0 Neural_Networks
case class Element(id: String, words: Array[Float], label: String)
RDD[Element]
1. One partition per graph.
2. Avoid shuffle.
case class Edge(orig: Int, dest: Int)
RDD[Edge]
1. Build the adjacency matrix from this representation.
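A minimal sketch of how lines of the two files could be turned into `Element` and `Edge` values is shown below. It assumes whitespace-separated fields in the layout of the samples above; the parsing helpers are hypothetical and are not taken from the talk's repository. In Spark they would typically be applied with `sc.textFile(path).map(...)`.

```scala
object CoraParsing {
  final case class Element(id: String, words: Array[Float], label: String)
  final case class Edge(orig: Int, dest: Int)

  /** Parses one content line: "<id> <w1> ... <wn> <label>". */
  def parseElement(line: String): Element = {
    val fields = line.trim.split("\\s+")
    Element(
      fields.head,
      fields.slice(1, fields.length - 1).map(_.toFloat), // the 0/1 word vector
      fields.last
    )
  }

  /** Parses one edge line: two paper ids separated by whitespace. */
  def parseEdge(line: String): Edge = {
    val Array(a, b) = line.trim.split("\\s+")
    Edge(a.toInt, b.toInt)
  }
}
```

Note that the paper ids are arbitrary integers, so they need to be mapped to consecutive row and column indices before the adjacency matrix is built.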
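import com.intel.analytics.bigdl.dataset.Sample
import com.intel.analytics.bigdl.numeric.NumericFloat
import com.intel.analytics.bigdl.tensor.Tensor
import org.apache.spark.rdd.RDD
// labelIndex (class name -> numeric class id) is an assumed helper: the label tensor must be numeric.
private[gcn] def createInputDataset(
    rdd: RDD[Element],
    labelIndex: Map[String, Float]
): RDD[Sample[Float]] =
  rdd.map { case Element(_, words, label) =>
    Sample(
      Tensor(words, Array(words.length)), // one entry per dictionary word
      Tensor(Array(labelIndex(label)), Array(1))
    )
  }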
Build the datasets using the Sample and Tensor[T] types from the Analytics Zoo library.
Datasets are represented as RDD[Sample].
One partition per graph. Use a Spark Partitioner in the case of multiple graphs.
Use GraphX to split the graph into its connected components.
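The idea of splitting a graph into components can be illustrated without a cluster. The sketch below is a plain-Scala union-find used as a stand-in for the GraphX connected components step, not the GraphX API itself; like GraphX, it labels each vertex with the smallest vertex id in its component. All names are hypothetical, and every edge endpoint is assumed to be present in `vertices`.

```scala
object Components {
  /** Maps every vertex to the smallest vertex id in its connected component. */
  def connectedComponents(vertices: Set[Int], edges: Seq[(Int, Int)]): Map[Int, Int] = {
    val parent = scala.collection.mutable.Map(vertices.toSeq.map(v => v -> v): _*)

    // Finds the root of v, compressing the path on the way back.
    def find(v: Int): Int =
      if (parent(v) == v) v
      else {
        val root = find(parent(v))
        parent(v) = root
        root
      }

    edges.foreach { case (a, b) =>
      val (ra, rb) = (find(a), find(b))
      // Always keep the smaller id as the root of the merged component.
      if (ra != rb) parent(math.max(ra, rb)) = math.min(ra, rb)
    }
    vertices.map(v => v -> find(v)).toMap
  }
}
```

The component label can then serve as the key of a custom partitioner, so that all nodes of one sub-graph land in the same Spark partition.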
1. The adjacency matrix is built and processed using Breeze CSCMatrix.
val builder = new CSCMatrix.Builder[Float](rows, cols)
edges.foreach { case Edge(r, c) =>
  builder.add(r, c, 1.0F)
}
val sparseAdj = builder.result() // materialize the sparse matrix
2. Transform to a symmetric matrix.
sparseAdj +:+
  (sparseAdj.t *:* (sparseAdj.t >:> sparseAdj).map(el => if (el) 1.0F else 0.0F)) -
  (sparseAdj *:* (sparseAdj.t >:> sparseAdj).map(el => if (el) 1.0F else 0.0F))
3. Matrix normalization, according to the spectral graph convolution equation.
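Steps 2 and 3 can be checked on a tiny example with dense arrays. The sketch below mirrors the symmetrization expression element by element and then applies the normalization $D^{-1/2} \hat{A} D^{-1/2}$ with $\hat{A} = A + I$. It uses plain Scala arrays instead of Breeze sparse matrices, and the object and method names are assumptions made for this illustration.

```scala
object AdjacencyPrep {
  type Mat = Array[Array[Double]]

  /** adj + adj.T * (adj.T > adj) - adj * (adj.T > adj), all element-wise. */
  def symmetrize(adj: Mat): Mat = {
    val n = adj.length
    Array.tabulate(n, n) { (i, j) =>
      val mask = if (adj(j)(i) > adj(i)(j)) 1.0 else 0.0
      adj(i)(j) + adj(j)(i) * mask - adj(i)(j) * mask
    }
  }

  /** Returns D^{-1/2} (A + I) D^{-1/2}, where D is the degree matrix of A + I. */
  def normalize(adj: Mat): Mat = {
    val n = adj.length
    val aHat = Array.tabulate(n, n)((i, j) => adj(i)(j) + (if (i == j) 1.0 else 0.0))
    // Self-loops guarantee every row sum is at least 1, so no division by zero.
    val dInvSqrt = aHat.map(row => 1.0 / math.sqrt(row.sum))
    Array.tabulate(n, n)((i, j) => dInvSqrt(i) * aHat(i)(j) * dInvSqrt(j))
  }
}
```

The symmetrization keeps, for every pair of nodes, the larger of the two directed weights, which turns the directed citation links into an undirected graph.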
def getModel(
    dropout: Double,
    matrix: Tensor[Float],
    batchSize: Int,
    inputSize: Int,
    intermediateSize: Int,
    labelsNumber: Int
): Sequential[Float] = {
  Sequential[Float]()
    .add(Dropout(dropout))
    .add(GraphConvolution[Float](matrix, batchSize, inputSize))
    .add(Linear[Float](inputSize, intermediateSize,
      wRegularizer = L2Regularizer(5e-4)).setName("layer-1"))
    .add(ReLU())
    .add(Dropout(dropout))
    .add(GraphConvolution[Float](matrix, batchSize, intermediateSize))
    .add(Linear[Float](intermediateSize, labelsNumber).setName("layer-2"))
    .add(LogSoftMax())
}
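NN Model (the sequential implementation is the getModel function above)
[Figure: Input → Drop → GraphConvLayer → LinearLayer → ReLU → Drop → GraphConvLayer → LinearLayer → SoftMax → Prediction, compared against Label]
Trainable parameters are in the modules drawn in red on the slide.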
Optimization Process
- We train with only 140 of the 2708 samples.
- Every mini-batch is equivalent to one epoch.
- Avoid shuffling the data in the data broadcast.
- One Spark partition for every sub-graph.
- The negative log likelihood (NLL) criterion.
- Adam optimizer with lr = 1E-3, beta1 = 0.9, beta2 = 0.999, epsilon = 1E-8, decay = 0, wdecay = 0
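For readers unfamiliar with Adam, the sketch below implements the textbook update rule with the hyperparameters listed above as defaults. It is a plain-Scala illustration, not the Analytics Zoo or BigDL optimizer, whose implementation details may differ; learning-rate decay and weight decay are left out because both are set to 0 here.

```scala
object Adam {
  /** Running first and second moment estimates and the step counter. */
  final case class State(m: Array[Double], v: Array[Double], t: Int)

  def init(n: Int): State = State(Array.fill(n)(0.0), Array.fill(n)(0.0), 0)

  /** One Adam update of the parameters `theta` given the gradient `grad`. */
  def step(
      theta: Array[Double],
      grad: Array[Double],
      s: State,
      lr: Double = 1e-3,
      beta1: Double = 0.9,
      beta2: Double = 0.999,
      eps: Double = 1e-8
  ): (Array[Double], State) = {
    val t = s.t + 1
    val m = s.m.zip(grad).map { case (mi, g) => beta1 * mi + (1 - beta1) * g }
    val v = s.v.zip(grad).map { case (vi, g) => beta2 * vi + (1 - beta2) * g * g }
    // Bias correction for the zero-initialized moment estimates.
    val mScale = 1.0 / (1.0 - math.pow(beta1, t))
    val vScale = 1.0 / (1.0 - math.pow(beta2, t))
    val next = theta.indices.map { i =>
      theta(i) - lr * (m(i) * mScale) / (math.sqrt(v(i) * vScale) + eps)
    }.toArray
    (next, State(m, v, t))
  }
}
```

On the very first step the bias-corrected ratio is close to the sign of the gradient, so each parameter moves by roughly lr regardless of the gradient's magnitude.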
Results
Case 1: One graph in one partition.
- Training for 1000 epochs.
- 140 labeled examples.
- Propagation function $HW$ (multilayer perceptron: $D^{-1/2} \hat{A} D^{-1/2}$ is replaced by the identity matrix). accuracy: 0.531019
- Propagation function $D^{-1/2} \hat{A} D^{-1/2} H W$ (renormalization trick). accuracy: 0.769202
Process Visualization
- Represent the output of the second hidden layer (7 neurons).
- Dimensionality reduction by applying t-SNE (t-Distributed Stochastic Neighbor Embedding).
- A snapshot is taken every 200 epochs.
[Figure: representation of the last layer using t-SNE with no convolution]
[Figure: representation of the last layer using t-SNE with spectral convolution]
Conclusions and future work
- Convolutions on graphs show promising results in graph analysis using deep learning.
- We can benefit from Spark's processing power to perform distributed NN training.
- The Scala ecosystem will help to develop and integrate with the big data world.
- Future work: a Scala 3 Graph Neural Network library on top of Spark.
Implementation:
https://github.com/emartinezs44/SparkGCN
