Graph Data Science Demo for Fraud Analysis
Joe Depeau
Sr. Presales Consultant, UK
28th April, 2020
@joedepeau
http://linkedin.com/in/joedepeau
Agenda
• Review of the Neo4j Graph Data Science Library
• Demo Data Overview
• Review of Graph Algorithms for Demo
• Demo
• Q&A
Neo4j Graph Data Science Library
Neo4j Graph Algorithms Overview
• Pathfinding & Search: Finds optimal paths or evaluates route availability and quality.
• Centrality / Importance: Determines the importance of distinct nodes in the network.
• Community Detection: Detects group clustering or partition options.
• Similarity: Evaluates how alike nodes are by neighbors and relationships.
• Link Prediction: Estimates the likelihood of nodes forming a future relationship.
Graph and ML algorithms in Neo4j
Pathfinding & Search
• Minimum Weight Spanning Tree
• Shortest Path
• Single Source Shortest Path
• All Pairs Shortest Path
• A*
• Yen’s K-shortest Paths
• Random Walk
• Breadth First Search
• Depth First Search
Centrality / Importance
• Degree Centrality
• Closeness Centrality
• Betweenness Centrality
• PageRank
• ArticleRank
• Eigenvector Centrality
Community Detection
• Triangle Count / Clustering Coefficient
• Weakly Connected Components
• Strongly Connected Components
• Label Propagation
• Louvain Modularity
• K-1 Colouring
• Modularity Optimisation
Similarity
• Node Similarity
• Approximate Nearest Neighbours
• Cosine Similarity
• Euclidean Similarity
• Jaccard Similarity
• Overlap Similarity
• Pearson Similarity
Link Prediction
• Adamic Adar
• Common Neighbours
• Preferential Attachment
• Resource Allocations
• Same Community
• Total Neighbours
https://neo4j.com/docs/graph-data-science/1.0/
Demo Data Overview
Some Examples of Typical Bank Data
Data categories: Event Data, Product and Services Data, Customer Data, Organisational Data, 3rd Party Data
Example items: Documentation, Employee Data, Processes, Systems and Databases, KPIs and Reports, Address, Personal Data, Documents, Relationships, Assets, Documentation, Processes, Product / Service Details, Product / Service Hierarchy, Pricing, Money Movements, Web / App Activity, Customer Contact, Social Media, Credit Rating Agencies, Market Data, Organisational Hierarchy, Corporate Data
Our Graph Model
Total Nodes: 3.4m
Total Relationships: 10.2m
Three ways a Client node can be Flagged
• Performed a transaction flagged as fraud
• Share a SSN with another Client
• Have more than one SSN on file
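The three flagging rules can be sketched on a handful of toy records. This is only an illustration: the record layout, the field names (`client_id`, `ssns`, `fraud_txn`) and the sample values are hypothetical and are not the demo's actual schema.

```python
from collections import defaultdict

# Hypothetical toy records; field names and values are illustrative only.
clients = [
    {"client_id": "c1", "ssns": ["111"], "fraud_txn": True},
    {"client_id": "c2", "ssns": ["222", "333"], "fraud_txn": False},
    {"client_id": "c3", "ssns": ["444"], "fraud_txn": False},
    {"client_id": "c4", "ssns": ["444"], "fraud_txn": False},
    {"client_id": "c5", "ssns": ["555"], "fraud_txn": False},
]


def flag_clients(records):
    """Return {client_id: set of reasons} for every flagged client."""
    # Map each SSN to the set of clients that have it on file.
    holders = defaultdict(set)
    for record in records:
        for ssn in record["ssns"]:
            holders[ssn].add(record["client_id"])

    flags = defaultdict(set)
    for record in records:
        cid = record["client_id"]
        # Rule 1: performed a transaction flagged as fraud.
        if record["fraud_txn"]:
            flags[cid].add("fraud_transaction")
        # Rule 2: shares an SSN with another client.
        if any(len(holders[ssn]) > 1 for ssn in record["ssns"]):
            flags[cid].add("shared_ssn")
        # Rule 3: has more than one distinct SSN on file.
        if len(set(record["ssns"])) > 1:
            flags[cid].add("multiple_ssns")
    return dict(flags)


flagged = flag_clients(clients)
```

In the graph itself these rules would be expressed as pattern matches over Client and SSN nodes; the dictionary version above only mirrors the logic.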
Graph Algorithms for Demonstration
PageRank
What: Finds important nodes based on their relationships.
Why: Identify important or influential Client nodes by quantifying the flows of money towards them.
Uses:
- Fraud detection
- Anti-money laundering
- Inform prioritization during analysis and investigation
The PageRank Algorithm
PageRank: what nodes can be considered ‘important’ in our graph based on money flows?
Output: .pagerank property
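To make the idea of importance based on money flows concrete, here is a minimal sketch of textbook weighted PageRank using power iteration. It is not the Neo4j library's implementation, and its scores are normalised to sum to 1, which may differ from what the library writes to the `.pagerank` property. The client names, transfer amounts, and the `pagerank` helper itself are hypothetical.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank on directed edges (source, target, weight).

    Weights are assumed to be positive (for example, amounts transferred).
    """
    nodes = sorted({n for s, t, _ in edges for n in (s, t)})
    count = len(nodes)

    # Total outgoing weight per node, used to split a node's rank
    # across its outgoing edges in proportion to the weight.
    out_weight = {n: 0.0 for n in nodes}
    for s, _, w in edges:
        out_weight[s] += w

    rank = {n: 1.0 / count for n in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread evenly.
        dangling = sum(rank[n] for n in nodes if out_weight[n] == 0.0)
        base = (1.0 - damping) / count + damping * dangling / count
        new_rank = {n: base for n in nodes}
        for s, t, w in edges:
            new_rank[t] += damping * rank[s] * w / out_weight[s]
        rank = new_rank
    return rank


# Hypothetical money transfers: (sender, receiver, amount).
transfers = [
    ("alice", "carol", 100.0),
    ("alice", "bob", 10.0),
    ("bob", "carol", 50.0),
    ("carol", "dave", 20.0),
    ("dave", "alice", 5.0),
]
scores = pagerank(transfers)
top_client = max(scores, key=scores.get)
```

Here `carol` receives most of the money sent by `alice` and all of it sent by `bob`, so she ends up with the highest score.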
Betweenness Centrality
What: For each pair of nodes, the fraction of shortest paths between them that pass through a given node, summed over all pairs.
Why: Identify well connected nodes within the graph, or bridges between communities.
Uses:
- Fraud detection
- Anti-money laundering
- Inform prioritization during analysis and investigation
The Betweenness Centrality Algorithm
Betweenness Centrality: what nodes can be considered ‘important’ in our graph based on how many ‘shortest paths’ they are present in?
Output: .centrality property
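The shortest-path counting can be sketched with Brandes' algorithm for unweighted graphs. This is a small illustration rather than the library's implementation; the graph is treated as undirected, and the node names and `betweenness` helper are hypothetical.

```python
from collections import deque


def betweenness(adjacency):
    """Brandes' betweenness for an unweighted, undirected graph.

    `adjacency` maps each node to a list of neighbours and must be symmetric.
    """
    score = {v: 0.0 for v in adjacency}
    for source in adjacency:
        # Breadth-first search that counts shortest paths from `source`.
        order = []
        preds = {v: [] for v in adjacency}
        sigma = {v: 0 for v in adjacency}   # number of shortest paths
        dist = {v: -1 for v in adjacency}
        sigma[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)

        # Walk back from the farthest nodes, accumulating dependencies.
        delta = {v: 0.0 for v in adjacency}
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != source:
                score[w] += delta[w]

    # Each undirected pair was counted once from each endpoint.
    return {v: s / 2.0 for v, s in score.items()}


# Two tightly linked pairs, {a, b} and {d, e}, bridged by node x.
graph = {
    "a": ["b", "x"],
    "b": ["a", "x"],
    "x": ["a", "b", "d", "e"],
    "d": ["x", "e"],
    "e": ["x", "d"],
}
centrality = betweenness(graph)
```

Every shortest path between the two pairs runs through `x`, so it is the only node with a non-zero score.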
Weakly Connected Components
What: Finds disconnected community subgraphs in our data.
Why: Identify communities based on connections with shared pieces of identity.
Uses:
- Householding
- Synthetic identities
- Stolen identities
The Weakly Connected Components Algorithm
Weakly Connected Components: what communities exist in the data based on connections to pieces of identity?
Output: .component_id property
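Weakly connected components ignore relationship direction, so they can be sketched with a union-find structure. The node names, the identity links, and the choice of the smallest node name as component id are all assumptions made for this example, not the library's behaviour.

```python
def weakly_connected_components(nodes, edges):
    """Return {node: component id}, ignoring edge direction.

    The component id is the smallest node name in the component
    (a convention chosen for this sketch only).
    """
    parent = {n: n for n in nodes}

    def find(n):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the smaller name as the root so ids are deterministic.
            parent[max(root_a, root_b)] = min(root_a, root_b)

    return {n: find(n) for n in nodes}


# Hypothetical clients linked to the pieces of identity they use.
nodes = ["client:1", "client:2", "client:3", "ssn:444", "phone:555", "phone:777"]
links = [
    ("client:1", "ssn:444"),
    ("client:2", "ssn:444"),
    ("client:2", "phone:555"),
    ("client:3", "phone:777"),
]
component_id = weakly_connected_components(nodes, links)
```

`client:1` and `client:2` land in the same component because they share an SSN, while `client:3` forms a separate one.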
Louvain Modularity
What: Finds communities of connected nodes in our graph. Can return intermediate results.
Why: Useful for identifying communities based on transaction behaviour rather than identity.
Uses:
- Fraud ring detection
- Anti-money laundering
The Louvain Algorithm
Louvain: what communities of nodes transact amongst themselves?
Output: .louvain_community property
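Louvain searches for a partition that maximises modularity. The sketch below does not implement that search; it only computes the modularity score of a given partition, which is the quantity Louvain tries to improve at each step. The graph, community labels, and `modularity` helper are hypothetical.

```python
from collections import defaultdict


def modularity(edges, community):
    """Modularity Q of a partition of an undirected weighted graph.

    `edges` holds (u, v, weight) once per undirected edge, and
    `community` maps each node to its community label.
    """
    total = sum(w for _, _, w in edges)  # total edge weight m
    internal = defaultdict(float)        # edge weight inside each community
    degree = defaultdict(float)          # summed weighted degree per community
    for u, v, w in edges:
        degree[community[u]] += w
        degree[community[v]] += w
        if community[u] == community[v]:
            internal[community[u]] += w
    # Q = sum over communities of (internal / m) - (degree / 2m)^2
    return sum(
        internal[c] / total - (degree[c] / (2.0 * total)) ** 2
        for c in degree
    )


# Two triangles of accounts joined by a single transaction (c, d).
edges = [
    ("a", "b", 1.0), ("b", "c", 1.0), ("a", "c", 1.0),
    ("d", "e", 1.0), ("e", "f", 1.0), ("d", "f", 1.0),
    ("c", "d", 1.0),
]
by_triangle = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
all_together = {n: 0 for n in "abcdef"}

q_split = modularity(edges, by_triangle)
q_single = modularity(edges, all_together)
```

Splitting the graph by triangle scores higher than putting every node in one community, which is why a modularity-based method would prefer the two-community partition here.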
Node Similarity
What: Similarity between nodes based on neighbours. Writes a new relationship to the graph.
Why: Identify similar nodes who share common pieces of identity.
Uses:
- Entity Resolution
- Synthetic identities
- Stolen identities
The Node Similarity Algorithm
Node Similarity: how similar are two Client nodes based on the non-Merchant accounts they transact with?
Output: SIMILAR relationship with .score property
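Neighbour-based similarity can be sketched by comparing the sets of accounts each client transacts with. This example uses the Jaccard coefficient as the similarity measure, which is an assumption for this sketch; the client names, account names, and `node_similarity` helper are hypothetical.

```python
from itertools import combinations


def node_similarity(neighbours, cutoff=0.0):
    """Jaccard similarity for every pair of nodes.

    `neighbours` maps each node to the set of nodes it is connected to.
    Pairs scoring at or below `cutoff` are dropped.
    """
    results = []
    for a, b in combinations(sorted(neighbours), 2):
        union = neighbours[a] | neighbours[b]
        if not union:
            continue  # neither node has neighbours, so nothing to compare
        score = len(neighbours[a] & neighbours[b]) / len(union)
        if score > cutoff:
            results.append((a, b, score))
    # Most similar pairs first.
    return sorted(results, key=lambda r: r[2], reverse=True)


# Hypothetical clients and the non-merchant accounts they transact with.
transacts_with = {
    "client:1": {"acc:1", "acc:2", "acc:3"},
    "client:2": {"acc:2", "acc:3"},
    "client:3": {"acc:9"},
}
similar = node_similarity(transacts_with)
```

Each tuple in `similar` corresponds to what would be written to the graph as a SIMILAR relationship carrying a `.score` property.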
Demo
Thank you!
