Recommendation Engine with
In-Database Machine Learning
Changran Liu, Mingxi Wu
| GRAPHAIWORLD.COM | #GRAPHAIWORLD |
Outline
● In-database model training
● Latent factor recommendation model
● Distributed model training in Graph
● Demo
● GSQL implementation
Traditional Model Training Pipeline
Database:
● data storage
● data update
● preprocess data
Machine learning platform:
● model training
● model validation
Applications:
● place order
● recommendation
● ...
[Diagram: the arrows between the three components are labeled "training data", "model", "request", and "results".]
In-Database Model Training
In-situ machine learning in the database:
● No need to export data
● Better support for continuous model training over evolving data
● Fewer limitations on model size
● Support for distributed model training
Applications:
● recommendation
● fraud detection
● ...
[Diagram: arrows labeled "request" and "results" connect the applications and the database.]
Movie Recommendation
[Diagram labels: movie features, user ratings]
Goals:
● Predict users' ratings for movies they haven't seen, based on their previous ratings
● Recommend movies to users based on the predicted ratings
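Movie Rating Prediction (Latent factors model)

Movie                  Alice  Bob  Carol  Dave
Love at last             5     5     0     0
Romance forever          5     ?     ?     0
Cute puppies of love     ?     4     0     ?
Toy story                ?     ?     ?     5
Sword vs. karate         0     0     5     ?
Nonstop car chases       0     0     5     4

● Each user j has a latent factor vector $\theta^{(j)}$
● Each movie i has a latent factor vector $x^{(i)}$
● Predict user j's rating of movie i by $(\theta^{(j)})^T x^{(i)}$

User vectors (Alice, Bob, Carol, Dave):
$\theta^{(1)} = [5, 0]$, $\theta^{(2)} = [5, 0]$, $\theta^{(3)} = [0, 5]$, $\theta^{(4)} = [0, 5]$

Movie vectors (in table order):
$x^{(1)} = [0.9, 0]$, $x^{(2)} = [1, 0.1]$, $x^{(3)} = [0.9, 0]$, $x^{(4)} = [0.1, 1]$, $x^{(5)} = [0.1, 1]$, $x^{(6)} = [0, 0.9]$

The two latent dimensions are labeled "romance" (first component) and "action" (second component).

Predicted ratings $(\theta^{(1)})^T x^{(i)}$ for i = 1, ..., 6: 4.5, 5, 4.5, 0.5, 0.5, 0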
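A minimal Python sketch of the prediction rule and of the recommendation goal, using the vectors and the rating table from the slide. It assumes that θ(1) to θ(4) belong to Alice, Bob, Carol, and Dave and that x(1) to x(6) follow the row order of the table; the helper names `predict` and `recommend` are hypothetical.

```python
# Latent factor vectors copied from the slide (assumed user/movie order).
user_vectors = {"Alice": [5, 0], "Bob": [5, 0], "Carol": [0, 5], "Dave": [0, 5]}
movie_vectors = {
    "Love at last": [0.9, 0],
    "Romance forever": [1, 0.1],
    "Cute puppies of love": [0.9, 0],
    "Toy story": [0.1, 1],
    "Sword vs. karate": [0.1, 1],
    "Nonstop car chases": [0, 0.9],
}

# Known ratings from the table; entries shown as "?" are simply absent.
known_ratings = {
    "Alice": {"Love at last": 5, "Romance forever": 5,
              "Sword vs. karate": 0, "Nonstop car chases": 0},
    "Bob": {"Love at last": 5, "Cute puppies of love": 4,
            "Sword vs. karate": 0, "Nonstop car chases": 0},
    "Carol": {"Love at last": 0, "Cute puppies of love": 0,
              "Sword vs. karate": 5, "Nonstop car chases": 5},
    "Dave": {"Love at last": 0, "Romance forever": 0,
             "Toy story": 5, "Nonstop car chases": 4},
}


def predict(user, movie):
    """Predicted rating: dot product of the user and movie vectors."""
    theta, x = user_vectors[user], movie_vectors[movie]
    return sum(t * f for t, f in zip(theta, x))


def recommend(user, top_n=1):
    """Rank the movies the user has not rated by predicted rating."""
    unseen = [m for m in movie_vectors if m not in known_ratings[user]]
    return sorted(unseen, key=lambda m: predict(user, m), reverse=True)[:top_n]


for movie in movie_vectors:
    print(movie, predict("Alice", movie))
print(recommend("Alice"))
```

For Alice, the only unrated movies are "Cute puppies of love" and "Toy story", so the ranking reduces to comparing those two predictions.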
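Cost Function
[Formula image not preserved in the transcript; its two terms are labeled "RMSE" and "regularization".]
[Graph: User 1 ($\theta^{(1)}$) rated Movie 1 ($x^{(1)}$) and Movie 2 ($x^{(2)}$), with edge ratings $y^{(1,1)}$ and $y^{(1,2)}$; User 2 ($\theta^{(2)}$) rated Movie 2 ($x^{(2)}$) and Movie 3 ($x^{(3)}$), with edge ratings $y^{(2,2)}$ and $y^{(2,3)}$.]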
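The formula on this slide is not present in the text. A common cost function for a latent factor model that matches the two labels (an error term and a regularization term) is sketched below as an assumption, not as the authors' exact formula. Here $y^{(j,i)}$ is the rating user j gave movie i (user index first, as in the edge labels of the graph), $R$ is the set of rated (user, movie) pairs, and $\lambda$ is a hypothetical regularization weight. The first term is a sum of squared errors; minimizing it also minimizes the RMSE over the rated pairs.

```latex
J = \frac{1}{2} \sum_{(j,i) \in R}
      \left( (\theta^{(j)})^T x^{(i)} - y^{(j,i)} \right)^2
  + \frac{\lambda}{2} \left( \sum_{j} \lVert \theta^{(j)} \rVert^2
      + \sum_{i} \lVert x^{(i)} \rVert^2 \right)
```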
Model Training in Graph
[Graph: User 1 holds $\theta^{(1)}$ and User 2 holds $\theta^{(2)}$; Movie 1, Movie 2, and Movie 3 hold $x^{(1)}$, $x^{(2)}$, and $x^{(3)}$. The edges carry the ratings $y^{(1,1)}$ (User 1, Movie 1), $y^{(1,2)}$ (User 1, Movie 2), $y^{(2,2)}$ (User 2, Movie 2), and $y^{(2,3)}$ (User 2, Movie 3).]
Phase 1:
● Collect $x^{(i)}$ and $y^{(j,i)}$ from the movies that each user j rated
  User 1: $[\theta^{(1)}, x^{(1)}, y^{(1,1)}]$, $[\theta^{(1)}, x^{(2)}, y^{(1,2)}]$
  User 2: $[\theta^{(2)}, x^{(2)}, y^{(2,2)}]$, $[\theta^{(2)}, x^{(3)}, y^{(2,3)}]$
● Compute the gradient contributed by each movie
  $[\theta^{(1)}, x^{(1)}, y^{(1,1)}] \rightarrow ((\theta^{(1)})^T x^{(1)} - y^{(1,1)}) \, x^{(1)}$
  $[\theta^{(1)}, x^{(2)}, y^{(1,2)}] \rightarrow ((\theta^{(1)})^T x^{(2)} - y^{(1,2)}) \, x^{(2)}$
  $[\theta^{(2)}, x^{(2)}, y^{(2,2)}] \rightarrow ((\theta^{(2)})^T x^{(2)} - y^{(2,2)}) \, x^{(2)}$
  $[\theta^{(2)}, x^{(3)}, y^{(2,3)}] \rightarrow ((\theta^{(2)})^T x^{(3)} - y^{(2,3)}) \, x^{(3)}$
Phase 2:
● Aggregate the gradient
  User 1: $((\theta^{(1)})^T x^{(1)} - y^{(1,1)}) \, x^{(1)} + ((\theta^{(1)})^T x^{(2)} - y^{(1,2)}) \, x^{(2)}$
  User 2: $((\theta^{(2)})^T x^{(2)} - y^{(2,2)}) \, x^{(2)} + ((\theta^{(2)})^T x^{(3)} - y^{(2,3)}) \, x^{(3)}$
● Update the feature vector using gradient descent
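The two phases can be summarized as a single update rule per vertex. The sketch below assumes a learning rate $\alpha$ (named `alpha` in the GSQL block), writes $y^{(j,i)}$ for the rating user j gave movie i, and uses hypothetical set names: $R_j$ for the movies rated by user j and $U_i$ for the users who rated movie i. No regularization term is included, because the per-edge gradients on the slides do not show one.

```latex
g_{\theta^{(j)}} = \sum_{i \in R_j}
    \left( (\theta^{(j)})^T x^{(i)} - y^{(j,i)} \right) x^{(i)},
\qquad
\theta^{(j)} \leftarrow \theta^{(j)} - \alpha \, g_{\theta^{(j)}}

g_{x^{(i)}} = \sum_{j \in U_i}
    \left( (\theta^{(j)})^T x^{(i)} - y^{(j,i)} \right) \theta^{(j)},
\qquad
x^{(i)} \leftarrow x^{(i)} - \alpha \, g_{x^{(i)}}
```

Each summand depends only on one edge and its two endpoints, which is why the gradient can be computed edge by edge and then aggregated at the vertices.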
Training
● Split data (splitData.gsql)
● Initialize latent factor vectors (initialization.gsql)
● Training loop (training.gsql):
  ● Compute the difference between prediction and label
  ● Converged? If no, update the latent vectors using gradient descent and repeat; if yes, finish
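The flow above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the contents of the three GSQL scripts: the 80/20 split, the random initialization in [0, 1), the RMSE-based convergence test, and the parameter names `k`, `alpha`, `tol`, `max_iters`, and `seed` are all hypothetical choices. The data are the known ratings from the movie rating table.

```python
import math
import random

# (user, movie, rating) triples: the known entries of the rating table.
RATINGS = [
    ("Alice", "Love at last", 5), ("Alice", "Romance forever", 5),
    ("Alice", "Sword vs. karate", 0), ("Alice", "Nonstop car chases", 0),
    ("Bob", "Love at last", 5), ("Bob", "Cute puppies of love", 4),
    ("Bob", "Sword vs. karate", 0), ("Bob", "Nonstop car chases", 0),
    ("Carol", "Love at last", 0), ("Carol", "Cute puppies of love", 0),
    ("Carol", "Sword vs. karate", 5), ("Carol", "Nonstop car chases", 5),
    ("Dave", "Love at last", 0), ("Dave", "Romance forever", 0),
    ("Dave", "Toy story", 5), ("Dave", "Nonstop car chases", 4),
]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def rmse(edges, theta, x):
    """Root mean squared difference between prediction and label."""
    total = sum((dot(theta[u], x[m]) - r) ** 2 for u, m, r in edges)
    return math.sqrt(total / len(edges))


def train(ratings, k=2, alpha=0.01, tol=1e-6, max_iters=5000, seed=0):
    rng = random.Random(seed)
    # Split data: shuffle, then hold out 20% of the edges for validation.
    edges = list(ratings)
    rng.shuffle(edges)
    cut = int(0.8 * len(edges))
    train_edges, valid_edges = edges[:cut], edges[cut:]
    # Initialize latent factor vectors with small random values.
    theta = {u: [rng.random() for _ in range(k)] for u, _, _ in ratings}
    x = {m: [rng.random() for _ in range(k)] for _, m, _ in ratings}
    history = [rmse(train_edges, theta, x)]
    for _ in range(max_iters):
        # Accumulate each edge's gradient contribution at both endpoints.
        g_theta = {u: [0.0] * k for u in theta}
        g_x = {m: [0.0] * k for m in x}
        for u, m, r in train_edges:
            delta = dot(theta[u], x[m]) - r
            for f in range(k):
                g_theta[u][f] += delta * x[m][f]
                g_x[m][f] += delta * theta[u][f]
        # Update the latent vectors using gradient descent.
        for u in theta:
            theta[u] = [t - alpha * g for t, g in zip(theta[u], g_theta[u])]
        for m in x:
            x[m] = [v - alpha * g for v, g in zip(x[m], g_x[m])]
        history.append(rmse(train_edges, theta, x))
        # Converged once the training RMSE stops changing.
        if abs(history[-2] - history[-1]) < tol:
            break
    return theta, x, history, rmse(valid_edges, theta, x)


theta, x, history, valid_rmse = train(RATINGS)
print("train RMSE:", history[0], "->", history[-1])
print("validation RMSE:", valid_rmse)
```

Gradients are accumulated for all edges before any vector changes, so every edge sees the same pre-update vectors within an iteration.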
Demo
GSQL Training Block

// One gradient-descent step over every USERs -(rate)-> MOVIE edge.
USERs = SELECT s FROM USERs:s -(rate:e)-> MOVIE:t
        ACCUM
          // Per edge: prediction error for this rating.
          DOUBLE prediction = dotProduct(s.@theta, t.@x),
          DOUBLE delta = prediction - e.rating,
          // Per edge: add the gradient contribution to both endpoints.
          s.@Gradient += product(t.@x, delta),
          t.@Gradient += product(s.@theta, delta)
        POST-ACCUM
          // Per vertex: gradient-descent update with learning rate alpha.
          s.@theta += product(s.@Gradient, -alpha),
          t.@x += product(t.@Gradient, -alpha);

Worked example (alpha = 0.01)

Edge                         rating  prediction    δ     δ' (after update)
Alice - Romance forever        5        6.9       1.9     1.6
Alice - Love at last           5        5.2       0.2    -0.1
Dave  - Love at last           0        4.0       4.0     3.6
Dave  - Nonstop car chases     4        3.0      -1.1    -1.1

Vertex               before            gradient                  after
Alice                θ = [1.5, 1.7]    grad(θ) = [4.2, 4.7]      θ' = [1.46, 1.65]
Dave                 θ = [1.0, 1.5]    grad(θ) = [6.9, 3.8]      θ' = [0.93, 1.46]
Romance forever      x = [2.0, 2.3]    grad(x) = [2.9, 3.2]      x' = [1.97, 2.27]
Love at last         x = [2.0, 1.3]    grad(x) = [4.3, 6.3]      x' = [1.96, 1.24]
Nonstop car chases   x = [1.0, 1.3]    grad(x) = [-1.1, -1.6]    x' = [1.01, 1.32]
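The numbers in this example can be checked with a short Python sketch that mirrors the ACCUM and POST-ACCUM phases. The edge assignment (Alice rated "Romance forever" and "Love at last" with 5; Dave rated "Love at last" with 0 and "Nonstop car chases" with 4) is inferred from the slide's values rather than stated on it, and the helper names are hypothetical.

```python
ALPHA = 0.01  # learning rate given on the slide

theta = {"Alice": [1.5, 1.7], "Dave": [1.0, 1.5]}
x = {
    "Romance forever": [2.0, 2.3],
    "Love at last": [2.0, 1.3],
    "Nonstop car chases": [1.0, 1.3],
}
# (user, movie, rating); edge assignment inferred from the slide's numbers.
ratings = [
    ("Alice", "Romance forever", 5.0),
    ("Alice", "Love at last", 5.0),
    ("Dave", "Love at last", 0.0),
    ("Dave", "Nonstop car chases", 4.0),
]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def scale(v, c):
    return [c * p for p in v]


def add(a, b):
    return [p + q for p, q in zip(a, b)]


def deltas(theta, x, ratings):
    """Prediction minus rating for every edge, in edge order."""
    return [dot(theta[u], x[m]) - r for u, m, r in ratings]


def gradient_step(theta, x, ratings, alpha):
    """One full-batch gradient-descent step over all edges."""
    grad_theta = {u: [0.0, 0.0] for u in theta}
    grad_x = {m: [0.0, 0.0] for m in x}
    # ACCUM phase: each edge contributes to both of its endpoints.
    for (u, m, _), d in zip(ratings, deltas(theta, x, ratings)):
        grad_theta[u] = add(grad_theta[u], scale(x[m], d))
        grad_x[m] = add(grad_x[m], scale(theta[u], d))
    # POST-ACCUM phase: each vertex moves against its aggregated gradient.
    new_theta = {u: add(theta[u], scale(grad_theta[u], -alpha)) for u in theta}
    new_x = {m: add(x[m], scale(grad_x[m], -alpha)) for m in x}
    return new_theta, new_x, grad_theta, grad_x


before = deltas(theta, x, ratings)
new_theta, new_x, grad_theta, grad_x = gradient_step(theta, x, ratings, ALPHA)
after = deltas(new_theta, new_x, ratings)

print("delta before:", before)
print("grad(theta): ", grad_theta)
print("grad(x):     ", grad_x)
print("theta':      ", new_theta)
print("x':          ", new_x)
print("delta after: ", after)
```

After rounding, the printed values agree with the slide's δ, grad(θ), grad(x), θ', x', and δ' entries.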
Q&A
User-Rate-Movie Graph
Recommendation methods:
● Content-based method
● K-nearest neighbors
● Latent factor (model-based)
● Hybrid method
● ...
[Graph: the users Alice (Disney fan, Marvel fan, ...) and Bob (Marvel fan, ...) connect to the movies Toy story (Disney, ...) and Iron man (Marvel, Action, ...) through rating edges labeled "rating: 5", "rating: 5", "rating: 4.5", and "rating: ?". The "rating: 4.5" edge appears only in the first two builds of the slide.]

