Smarter AI with
Analytical Graph
Databases
17 December 2020
Victor Lee
Head of Product Strategy, TigerGraph
© 2020. ALL RIGHTS RESERVED. | TIGERGRAPH.COM | CONFIDENTIAL INFORMATION |
“Graph analysis is possibly the single most
effective competitive differentiator for
organizations pursuing data-driven
operations and decisions after the
design of data capture.”
AGENDA
● What is an Analytical Graph Database?
● Integrated Knowledge, More Insight
● Three Basic Approaches for Graph + AI, with Use-Case Examples
○ Unsupervised Learning
○ Feature Enrichment from Graph Features
○ In-Database Learning
Graph+AI Delivers More Value, Better Results
Richer, Smarter Data
● Connections-as-data
● Connects different datasets, breaks down silos
Deeper, Smarter Questions
● Look for semantic patterns of relationship
● Search far & wide more easily & faster than other DBs
More Computational Options
● Graph algorithms
● Graph-enhanced machine learning
Explainable Results
● Semantic data model, queries, and answers
● Visual exploration and results
[Figure: example graph with vertices Customer, Supplier, Product, Payment, Order, Location 1, Location 2 and edges PURCHASED, RESIDES, SHIPS TO, SHIPS FROM, ACCEPTED, MAKES, NOTIFIES, KNOWS]
Types of Graph Databases
Transactional
Analytics
Property Graphs:
• Higher performance for queries, transactions, and advanced analytics
• Pattern matching
• Schema-free or schema-based
• Schema-based allows application-specific tuning
Semantic (RDF) Knowledge Graph:
• Collection of facts (RDF triples)
• Ontology to model concepts & rules
• Pattern matching
• Logical inference
• Standards-based
TigerGraph is a High-Performance and Scalable Property Graph, for both Analytics & Transactions.
Real-World Better Outcomes from Graph+AI
Healthcare: Real-time recommendation
● 1.3TB graph brain
● Real-time care recommendations
● Saving $150M/yr, more
Industrial Supply Chain: Analytics for decisions
● Analytics: weeks → minutes
● Reveal opportunities, optimize tactical & strategic decisions
● Saving $25M+/yr
Financial Services: Real-time fraud detection
● Integrates multiple tools
● "Magical" real-time visual results for investigators
● Scalable for growth
Case 1: Analytical Queries & Graph Algorithms
Types of Graph Algorithms
● Path Finding
● Clustering / Community Detection
○ Lenient clustering - connected component: one connection
○ Strict clustering - clique detection: every possible connection
○ Relative density - more connections in-group than between-group
● Ranking and Centrality
○ PageRank, HITS
○ SimRank, RoleSim
● Frequent Pattern Discovery
○ Agglomerative search
BOLD indicates more complex tasks, with iterative algorithms, which can be considered unsupervised learning
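The three clustering notions listed above can be contrasted on a toy graph. The following is a minimal Python sketch, not TigerGraph's implementation: the function names, the vertex names, and the edge list are all assumptions made for illustration.

```python
from itertools import combinations


def connected_components(vertices, edges):
    """Lenient clustering: any chain of connections puts vertices in one cluster."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, components = set(), []
    for start in vertices:
        if start in seen:
            continue
        stack, comp = [start], set()
        seen.add(start)
        while stack:  # depth-first traversal from the unvisited start vertex
            node = stack.pop()
            comp.add(node)
            for nbr in adj[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    stack.append(nbr)
        components.append(comp)
    return components


def is_clique(group, edges):
    """Strict clustering test: every pair in the group is directly connected."""
    edge_set = {frozenset(e) for e in edges}
    return all(frozenset(pair) in edge_set for pair in combinations(group, 2))


def internal_external_edges(group, edges):
    """Relative density: count edges inside the group vs. edges leaving it."""
    group = set(group)
    internal = sum(1 for u, v in edges if u in group and v in group)
    external = sum(1 for u, v in edges if (u in group) != (v in group))
    return internal, external


# Hypothetical toy graph: a triangle a-b-c, a tail c-d, and an isolated vertex e.
vertices = ["a", "b", "c", "d", "e"]
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]

print(connected_components(vertices, edges))          # two components
print(is_clique({"a", "b", "c"}, edges))              # True
print(is_clique({"a", "b", "c", "d"}, edges))         # False
print(internal_external_edges({"a", "b", "c"}, edges))  # (3, 1)
```

Here {a, b, c, d} is one lenient cluster, only {a, b, c} passes the strict test, and {a, b, c} has three in-group edges against one between-group edge.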
How do I find the most influential provider in each region for a particular medical condition?
Whole-Graph Compute problem
1. Analyze claims data to identify referral relationships among providers (Time Series Analysis)
2. Create subsets of claims around each condition with a group of healthcare codes (e.g. CPT codes) for each region (e.g. local healthcare market)
3. Utilize PageRank to score hubs within each market
Condition: Diabetes
Healthcare Market: S. San Jose, CA
Hub Identified: Dr. Thomas
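Step 3 above can be sketched with a plain power-iteration PageRank run separately on each market's referral edges. This is an illustrative Python sketch, not the GSQL algorithm from the talk: the damping factor, the iteration count, the helper names, and every referral edge below are assumptions; only the market name and "Dr. Thomas" come from the slide.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over directed (source, target) edges."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Every node receives the same "teleport" share each round.
        new_rank = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for src in nodes:
            # A node with no outgoing referrals spreads its rank evenly.
            targets = out_links[src] or nodes
            share = damping * rank[src] / len(targets)
            for dst in targets:
                new_rank[dst] += share
        rank = new_rank
    return rank


def hubs_by_market(referrals_by_market):
    """Return the top-ranked provider for each market's referral subgraph."""
    hubs = {}
    for market, edges in referrals_by_market.items():
        scores = pagerank(edges)
        hubs[market] = max(scores, key=scores.get)
    return hubs


# Hypothetical diabetes referrals (referring provider, receiving provider).
referrals = {
    "S. San Jose, CA": [
        ("Provider A", "Dr. Thomas"),
        ("Provider B", "Dr. Thomas"),
        ("Provider C", "Dr. Thomas"),
        ("Dr. Thomas", "Provider A"),
    ],
}

print(hubs_by_market(referrals))
```

With these made-up edges, the provider who receives the most referrals ends up with the highest score in the market.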
Who is influenced by these leaders (e.g. other doctors, chiropractors, physical therapists, facilities)?
Utilize Community Detection
1. Identify communities of providers around each hub for each region and for a specific condition
2. Track changes over time to detect significant shifts in communities
Condition: Diabetes
Healthcare Market: S. San Jose, CA
Hub Identified: Dr. Thomas
Community Detected: Diabetes – S. San Jose – Dr. Thomas
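One simple way to approach step 2 above is to compare each hub's community membership between two snapshots with Jaccard similarity. The sketch below assumes the communities have already been produced by some community detection algorithm; the threshold, the function names, and the member identifiers are hypothetical, and the talk does not say which measure was actually used.

```python
def jaccard(a, b):
    """Overlap of two member sets: |A ∩ B| / |A ∪ B| (1.0 if both are empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def community_shifts(before, after, threshold=0.5):
    """Return {hub: overlap} for hubs whose community changed significantly."""
    shifted = {}
    for hub in before.keys() | after.keys():
        score = jaccard(before.get(hub, ()), after.get(hub, ()))
        if score < threshold:
            shifted[hub] = score
    return shifted


# Hypothetical community snapshots keyed by hub, one per time period.
communities_q1 = {
    "Dr. Thomas": {"p1", "p2", "p3", "p4"},
    "Hub B": {"x", "y", "z"},
}
communities_q2 = {
    "Dr. Thomas": {"p1", "p5", "p6"},
    "Hub B": {"x", "y", "z", "w"},
}

print(community_shifts(communities_q1, communities_q2))
```

In this made-up data, Dr. Thomas's community keeps only one of six distinct members (overlap 1/6) and is flagged, while Hub B's community (overlap 3/4) is not.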
Find Similar Cases to deliver better healthcare
● Seamlessly integrate multiple sources of data to provide a unified and comprehensive view of each journey among 50M Medicare members
● Find similar members with a click of a button in real-time
● Deliver care path recommendations for similar members
Graph-Based Structural Similarity
Use a vertex's neighbors as its feature set
● Cosine: Use edge weights to each neighboring vertex
[Figure: vertices A and B connected by weighted edges to feature vertices W, X, Y, Z]
A's weighted neighbors = {4,1,2,0}
B's weighted neighbors = {0,2,3,1}
Cos(A,B) = 8 / [√21√14] = 0.4666
W,X,Y,Z represent feature vertices, different vertex types than A,B
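The cosine calculation above can be reproduced in a few lines of Python. This sketch assumes the weight vectors are listed in the order W, X, Y, Z (the transcript does not state the order, and the result does not depend on it); the dictionary representation and function name are illustrative choices.

```python
import math


def cosine_similarity(weights_a, weights_b):
    """Cosine similarity of two vertices given {neighbor: edge_weight} maps."""
    neighbors = weights_a.keys() | weights_b.keys()
    # A missing edge counts as weight 0.
    dot = sum(weights_a.get(n, 0) * weights_b.get(n, 0) for n in neighbors)
    norm_a = math.sqrt(sum(w * w for w in weights_a.values()))
    norm_b = math.sqrt(sum(w * w for w in weights_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Edge weights from the slide; the W, X, Y, Z assignment is assumed.
a_edges = {"W": 4, "X": 1, "Y": 2}  # A's weighted neighbors = {4,1,2,0}
b_edges = {"X": 2, "Y": 3, "Z": 1}  # B's weighted neighbors = {0,2,3,1}

print(round(cosine_similarity(a_edges, b_edges), 4))  # 0.4666
```

The dot product is 1·2 + 2·3 = 8 and the norms are √21 and √14, matching the slide's result.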
Case 2: Graph Feature Extraction
Challenge
Find and report fraudsters among billions of calls per week.
Solution
● Build graph: Real-time operational graph with 600M phone nodes & 15B call detail records.
● Get features and labels: Domain experts write GSQL queries to extract 118 features/phone. Some past calls are labeled for 3 types of unwanted calls.
● Train: Feed machine learning with training data for fraud detection with 118 features/phone for 30M calls.
● Deploy: For each incoming call, extract the current 118 features (subsecond) and apply model for real-time answer.
Results
● If an unwanted call is predicted, display alert on recipient's phone
● Process 2000+ calls/sec
● Improved customer satisfaction
Customer: China Mobile
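To make the "get features" step concrete, the sketch below derives a handful of per-phone graph features from raw call records in Python. The four features, the record layout, and the sample calls are hypothetical stand-ins; the 118 production features are written in GSQL and are not listed in the talk.

```python
from collections import defaultdict


def extract_phone_features(calls):
    """Build per-phone features from (caller, callee, duration_seconds) records."""
    callees = defaultdict(set)   # phone -> phones it has called
    callers = defaultdict(set)   # phone -> phones that have called it
    out_count = defaultdict(int)
    out_duration = defaultdict(float)
    for caller, callee, duration in calls:
        callees[caller].add(callee)
        callers[callee].add(caller)
        out_count[caller] += 1
        out_duration[caller] += duration

    features = {}
    for phone in set(callees) | set(callers):
        n_out = out_count[phone]
        contacted = callees[phone]
        called_back = contacted & callers[phone]
        features[phone] = {
            "out_calls": n_out,
            "distinct_callees": len(contacted),
            "avg_out_duration": out_duration[phone] / n_out if n_out else 0.0,
            # Share of contacted phones that have also called this phone.
            "callback_ratio": len(called_back) / len(contacted) if contacted else 0.0,
        }
    return features


# Hypothetical call detail records.
calls = [
    ("spam", "u1", 5), ("spam", "u2", 4), ("spam", "u3", 6),
    ("u1", "u2", 120), ("u2", "u1", 90),
]

features = extract_phone_features(calls)
print(features["spam"])
print(features["u1"])
```

A feature such as callback_ratio depends on relationships between phones rather than on a single record, which is the kind of signal a graph query makes easy to compute. The resulting feature vectors would then be fed to whatever model is trained downstream.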
Using Graph-Based Features for Machine Learning In Healthcare:
Good Doctor - Bad Doctor
Powering Explainable AI with Graph Database
Additional details at https://www.tigergraph.com/solutions/ai-and-machine-learning/
Case 3: In-Database Machine Learning
● Native graph storage and property graph (PG) model
● Coded once, auto scale-out & scale-up
● Real-time updates
● GSQL Turing-complete language
○ Preprocess data
○ Training: flow-control, accumulator, pattern match
○ Model validation
Applications:
● Entity resolution
● Recommendation
● Fraud detection
● ...
[Figure: TRAIN and DEPLOY workflow with labels Training data, GSQL ML Algorithm, Model, Query, Prediction]
In Database ML for Movie Recommendation
[Figure labels: movie features, users ratings]
Goals:
● Predict users' ratings for movies, based on previous ratings
● Recommend movies to users based on rating prediction
User-Rating-Movie Graph
Recommendation Approaches
● Collaborative filtering
● Content based method
● K-nearest neighbors
● Latent factor (model-based)
● Hybrid method
● ...
[Figure: example graph. Users: Alice (Disney fan, Marvel fan, ...), Bob (Marvel fan, ...). Movies: Toy story (Disney, ...), Iron man (Marvel, Action, ...). Rating edges: 5, 5, and one unknown (?)]
MovieLens dataset
https://grouplens.org/datasets/movielens/
● 100K ratings and 40K tags that 1K users gave to 17K movies
● Ratings are from 0 to 5 stars
Movie Rating Prediction (Latent factors model)
Movie                  Alice  Bob  Carol  Dave
Love at last           5      5    0      0
Romance forever        5      ?    ?      0
Cute puppies of love   ?      4    0      ?
Toy story              ?      ?    ?      5
Sword vs. karate       0      0    5      ?
Nonstop car chases     0      0    5      4
● Each user j has a latent factor vector: $\theta^{(j)}$
● Each movie i has a latent factor vector: $x^{(i)}$
● Predict user j's rating of movie i by: $(\theta^{(j)})^T x^{(i)}$
User vectors: $\theta^{(1)} = [5, 0]$, $\theta^{(2)} = [5, 0]$, $\theta^{(3)} = [0, 5]$, $\theta^{(4)} = [0, 5]$
Movie vectors: $x^{(1)} = [0.9, 0]$, $x^{(2)} = [1, 0.1]$, $x^{(3)} = [0.9, 0]$, $x^{(4)} = [0.1, 1]$, $x^{(5)} = [0.1, 1]$, $x^{(6)} = [0, 0.9]$
Figure axis labels: action, romance
Predicted ratings $(\theta^{(1)})^T x^{(i)}$ for i = 1 to 6: 4.5, 5, 4.5, 0.5, 0.5, 0
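The prediction rule can be checked directly against the numbers on this slide. The Python sketch below assumes that the θ vectors follow the column order of the table (Alice, Bob, Carol, Dave) and the x vectors follow its row order; the slide does not state that mapping explicitly, and the function name is illustrative.

```python
# User latent factor vectors theta^(j), assumed to follow the table's columns.
theta = {
    "Alice": [5, 0],
    "Bob": [5, 0],
    "Carol": [0, 5],
    "Dave": [0, 5],
}

# Movie latent factor vectors x^(i), assumed to follow the table's rows.
x = {
    "Love at last": [0.9, 0],
    "Romance forever": [1, 0.1],
    "Cute puppies of love": [0.9, 0],
    "Toy story": [0.1, 1],
    "Sword vs. karate": [0.1, 1],
    "Nonstop car chases": [0, 0.9],
}


def predict_rating(user, movie):
    """Predicted rating = dot product of the user and movie latent vectors."""
    return sum(t * f for t, f in zip(theta[user], x[movie]))


# Alice's predicted ratings, in row order.
alice = [predict_rating("Alice", movie) for movie in x]
print(alice)

# Fill in an entry that is unknown ("?") in the table.
print(predict_rating("Bob", "Romance forever"))
```

Alice's predictions reproduce the values 4.5, 5, 4.5, 0.5, 0.5, 0 shown on the slide, which supports the assumed mapping. Training, which is not shown here, would fit both sets of vectors to the known ratings.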
The Future of In-Graph ML
Neural Networks for Graphs
https://medium.com/@terngoodod/a-comprehensive-survey-on-graph-neural-networks-part-1-types-of-graph-neural-network-1dd93b823c70
Basic Conventional Neural Network
Input layer = FeatureMatrix X = FeatureVector * Nodes
HiddenLayer H1 = Activation_Function(P * X)
HiddenLayer H2 = Activation_Function(P *H1)
Propagation P = weighted edges
Graph Convolutional Network
Input layer = FeatureMatrix X = FeatureVector * Nodes
HiddenLayer H1 = Activation_Function(A*X*P), A = Adjacency Matrix (Edges)
HiddenLayer H2 = Activation_Function(A*H1*P)
Propagation P = weighted edges
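The difference between the two layer types can be seen in a small forward pass. The Python sketch below makes several assumptions that are not in the slides: X is stored as nodes × features (so the conventional layer is written X*P), ReLU is the activation function, A includes self-loops and is left unnormalized as in the slide's formula, and all matrix values are made up.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def relu(m):
    """Element-wise ReLU activation."""
    return [[max(0.0, v) for v in row] for row in m]


def dense_layer(x, p):
    """Conventional layer: each node's features are transformed independently."""
    return relu(matmul(x, p))


def gcn_layer(a, x, p):
    """Graph convolution: sum features over neighbors (A*X), then transform by P."""
    return relu(matmul(matmul(a, x), p))


# Hypothetical 3-node path graph 0-1-2, with self-loops on the diagonal.
A = [[1, 1, 0],
     [1, 1, 1],
     [0, 1, 1]]
X = [[1, 0],      # one row of features per node
     [0, 1],
     [1, 1]]
P = [[1, -1],     # weights (fixed here, learned in practice)
     [0.5, 1]]

print(dense_layer(X, P))   # [[1, 0.0], [0.5, 1], [1.5, 0.0]]
print(gcn_layer(A, X, P))  # [[1.5, 0.0], [3.0, 0.0], [2.0, 1]]
```

In the conventional layer, node 0's output depends only on its own features; in the graph convolutional layer it also reflects node 1's features, because A*X sums over each node's neighborhood before the weights are applied. Published GCN variants usually normalize A by node degree, which this sketch omits.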
The TigerGraph Difference
Feature: Real-Time Deep-Link Querying (5 to 10+ hops deep)
Design Difference:
● Native Graph design
● C++ engine, for high performance
● Storage Architecture
Benefit:
● Uncovers hard-to-find patterns
● Operational, real-time
● HTAP: Transactions+Analytics
Feature: Handling Massive Scale
Design Difference:
● Distributed DB architecture
● Massively parallel processing
● Compressed storage reduces footprint and messaging
Benefit:
● Integrates all your data
● Automatic partitioning
● Elastic scaling of resource usage
Feature: In-Database Analytics
Design Difference:
● GSQL: High-level yet Turing-complete language
● User-extensible graph algorithm library, runs in-DB
● ACID (OLTP) and Accumulators (OLAP)
Benefit:
● Avoids transferring data
● Richer graph context
● In-DB machine learning
Summary for "Why Graph for ML/AI?"
● Natural Data Model - Graph is how we think
● Richer data - connections between entities, graph-based features
● Graphs have always had a natural role in machine learning:
○ Unsupervised learning through graph algorithms, frequent pattern mining
○ Learning through neural networks and deep learning
● Graph data models are uniquely qualified to provide explainable AI.
● Native Graphs with Massively Parallel Processing, like TigerGraph, enable large-scale feature extraction and in-graph analytics
Q & A
Dr. Victor Lee
Head of Product Strategy & Developer Relations
victor@tigergraph.com
Thank You

Shift Remote: AI: Smarter AI with analytical graph databases - Victor Lee (Tigergraph)
