Graphs for AI and ML
Dr. Jim Webber
Chief Scientist, Neo4j
@jimwebber
● Some definitions
● Accidental Skynet
● Graph theory
● Contemporary graph ML
● The future of graph AI
Overview
● ML - Machine Learning
○ Finding functions from historical data to guide future
interactions within a given domain
● AI - Artificial Intelligence
○ The property of a system that it appears intelligent to its users
○ Often, but not always, using ML techniques
○ Or ML implementations that can be cheaply retrained to address
neighbouring domains
A Bluffer’s Guide to AI-cronyms
● Predictive analytics
○ Use past data to predict the future
● General purpose AI
○ ML with transfer learning such that learned experiences in one
domain can be applied elsewhere
● Human-like AI
Often conflated with
ML all the things
Where are we today?
Extract all the features!
• What do we do? Turn it to
vectors and pump it through a
classification or regression
model
• That’s actually not a bad
thing
• But we can do so much before
we even get to ML…
• … if we have graph data
Credit: Graph Algorithms, Hodler and Needham, O’Reilly 2019
http://www.bbc.co.uk/london/travel/downloads/tube_map.html
• Nodes with optional properties and optional labels
• Named, directed relationships with optional properties
• Relationships have exactly one start and end node
• Which may be the same node
Labeled Property graph model
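The four rules of the labeled property graph model can be sketched as two small Python types. This is only an illustration of the model, not how Neo4j stores data; the class names, the sample values and the KNOWS relationship are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node with optional labels and optional properties."""
    labels: set = field(default_factory=set)
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """A named, directed relationship with exactly one start and one end node."""
    name: str
    start: Node
    end: Node
    properties: dict = field(default_factory=dict)


# Hypothetical sample data, loosely based on the retail example later in the talk.
mickey = Node({"Person"}, {"name": "Mickey Smith"})
pilsner = Node({"Product"}, {"product": "Peewee Pilsner"})
bought = Relationship("BOUGHT", mickey, pilsner)

# The start and end node may be the same node (a self-loop).
# KNOWS and its "since" property are purely illustrative.
loop = Relationship("KNOWS", mickey, mickey, {"since": 1978})
```

Labels and properties default to empty, which mirrors the "optional" wording on the slide, while start and end are required arguments because a relationship cannot dangle.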
Fearless querying
MATCH path = (:author {name:'Jim Webber'})
-[*]->(:character {name:'The Doctor'})
RETURN path
OR
MATCH (me:author {name:'Jim Webber'}),
(doc:character {name:'The Doctor'}),
path = shortestPath((me)-[*]->(doc))
RETURN path
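What a shortest-path search does over directed relationships can be sketched with a breadth-first search in Python. This is a minimal illustration, not Neo4j's implementation of shortestPath; the intermediate nodes in the sample graph are invented for the example.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Breadth-first search over directed edges; returns a node list or None."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk the predecessor chain back to the start.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in graph.get(node, []):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


# Hypothetical graph: only the two end points come from the slide.
graph = {
    "Jim Webber": ["Neo4j"],
    "Neo4j": ["Graph Databases", "Doctor Who dataset"],
    "Graph Databases": ["Doctor Who dataset"],
    "Doctor Who dataset": ["The Doctor"],
}
path = shortest_path(graph, "Jim Webber", "The Doctor")
```

Because edges are followed only in their stored direction, searching from "The Doctor" back to "Jim Webber" finds nothing in this sample graph.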
Take a step back
We can be smarter about this
Realtime Predictive Analytics
(circa 2008)
Not AI, but extremely effective
Credit: https://medium.com/basecs/breaking-down-breadth-first-search-cebe696709d9
Credit: https://www.networkworld.com/article/3211410/lan-wan/the-10-most-powerful-companies-in-enterprise-networking.html
Toolkit matures into
proper database
• Cypher and Neo4j server make
real time graph analytical
patterns simple to apply
• Amazing and humane to
implement
[Diagram: example retail graph]
Person: Firstname: Mickey, Surname: Smith, DoB: 19781006
Product: Badgers Nadgers Ale, SKU: 5e175641
Product: Peewee Pilsner, SKU: 2555f258
Product: Baby Dry Nights, SKU: 49d102bc
Product: XBox 360, SKU: 49d102bc
Category: beer, Category: alcoholic drinks
Category: nappies, Category: baby
Category: console, Category: consumer electronics
Relationships: BOUGHT, MEMBER_OF
Firstname: *
Surname: *
DoB: 1996 > x > 1972
BOUGHT Category: beer
BOUGHT Category: nappies
BOUGHT Category: game console
Young fathers pattern
Firstname: *
Surname: *
DoB: 1996 > x > 1972
BOUGHT Category: beer
BOUGHT Category: nappies
!BOUGHT Category: game console
Business opportunity
[Diagram: (daddy), anonymous nodes (), (beer), (nappies), (console)]
(d)-[:BOUGHT]->()-[:MEMBER_OF]->(n)
(d)-[:BOUGHT]->()-[:MEMBER_OF]->(b)
(d)-[:BOUGHT]->()-[:MEMBER_OF]->(c)
Flatten the graph
(d:Person)-[:BOUGHT]->()-[:MEMBER_OF]->(n:Category)
(d:Person)-[:BOUGHT]->()-[:MEMBER_OF]->(b:Category)
(d:Person)-[:BOUGHT]->()-[:MEMBER_OF]->(c:Category)
Include any labels
MATCH (d:Person)-[:BOUGHT]->()-[:MEMBER_OF]->(n:Category),
(d:Person)-[:BOUGHT]->()-[:MEMBER_OF]->(b:Category)
Add a MATCH clause
MATCH (d:Person)-[:BOUGHT]->()-[:MEMBER_OF]->(n:Category),
(d:Person)-[:BOUGHT]->()-[:MEMBER_OF]->(b:Category),
(c:Category)
WHERE NOT((d)-[:BOUGHT]->()-[:MEMBER_OF]->(c))
Constrain the Pattern
MATCH (d:Person)-[:BOUGHT]->()-[:MEMBER_OF]->(n:Category),
(d:Person)-[:BOUGHT]->()-[:MEMBER_OF]->(b:Category),
(c:Category)
WHERE n.category = "nappies" AND
b.category = "beer" AND
c.category = "console" AND
NOT((d)-[:BOUGHT]->()-[:MEMBER_OF]->(c))
Add property constraints
MATCH (d:Person)-[:BOUGHT]->()-[:MEMBER_OF]->(n:Category),
(d:Person)-[:BOUGHT]->()-[:MEMBER_OF]->(b:Category),
(c:Category)
WHERE n.category = "nappies" AND
b.category = "beer" AND
c.category = "console" AND
NOT((d)-[:BOUGHT]->()-[:MEMBER_OF]->(c))
RETURN DISTINCT d AS daddy
Profit!
==> +---------------------------------------------+
==> | daddy |
==> +---------------------------------------------+
==> | Node[15]{name:"Rory Williams",dob:19880121} |
==> +---------------------------------------------+
==> 1 row
==> 0 ms
==>
neo4j-sh (0)$
Results
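The same "bought beer and nappies, but not a console" pattern can be sketched in plain Python to show what the Cypher query is matching. The purchase data follows the sample graph in the editor's notes; the function names and the dictionary layout are assumptions made for this example, and a real graph database would traverse relationships rather than scan every person.

```python
# Person -> products bought (BOUGHT relationships from the editor's notes).
bought = {
    "Mickey Smith": {"Peewee Pilsner", "Badgers Nadgers Ale", "Baby Dry Nights", "XBox 360"},
    "Rose Tyler": {"Shiraz", "Baby Dry Nights"},
    "Rory Williams": {"Peewee Pilsner", "Baby Dry Nights"},
}

# Product -> direct category (one MEMBER_OF hop, as in the Cypher pattern).
member_of = {
    "Peewee Pilsner": "beer",
    "Badgers Nadgers Ale": "beer",
    "Baby Dry Nights": "nappies",
    "XBox 360": "console",
    "Shiraz": "alcoholic drinks",
}


def categories_bought(person):
    """Categories reached via (person)-[:BOUGHT]->()-[:MEMBER_OF]->(category)."""
    return {member_of[product] for product in bought[person]}


def match_pattern(required, excluded):
    """People who bought from every required category and no excluded one."""
    return sorted(
        person
        for person in bought
        if required <= categories_bought(person)
        and not (excluded & categories_bought(person))
    )


daddies = match_pattern({"nappies", "beer"}, {"console"})
```

With this data the only match is Rory Williams, which agrees with the single row shown on the Results slide.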
Which sushi restaurants
in NYC do my friends
like?
Facebook Graph Search
See http://maxdemarzi.com/
Graph Structure
Simple Query, Intelligent Results
MATCH (:Person {name: 'Jim'})
-[:IS_FRIEND_OF]->(:Person)
-[:LIKES]->(restaurant:Restaurant)
-[:LOCATED_IN]->(:Place {location: 'New York'}),
(restaurant)-[:SERVES]->(:Cuisine {cuisine: 'Sushi'})
RETURN restaurant
Search structure
Graph Theory
• Rich knowledge of how graphs
operate in many domains
• Off the shelf algorithms to
process those graphs for
information, insight, predictions
• Low barrier to entry
• Amazingly powerful
Triadic Closure
name: Kyle
name: Stan name: Kenny
Triadic Closure
name: Kyle
name: Stan name: Kenny
name: Kyle
name: Stan name: Kenny
FRIEND
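Triadic closure can be turned into a simple link predictor: any two nodes that share a neighbour but are not yet connected are candidates for a future relationship. The sketch below assumes an undirected friendship graph stored as a dictionary of sets; the function name and the scoring by shared-neighbour count are illustrative choices, not something the slides prescribe.

```python
from itertools import combinations


def predicted_closures(friends):
    """Map each unconnected pair to the number of neighbours they share."""
    counts = {}
    for node, neighbours in friends.items():
        for a, b in combinations(sorted(neighbours), 2):
            if b not in friends.get(a, set()):
                counts[(a, b)] = counts.get((a, b), 0) + 1
    return counts


# Kyle is friends with Stan and Kenny, who are not (yet) friends themselves.
friends = {
    "Kyle": {"Stan", "Kenny"},
    "Stan": {"Kyle"},
    "Kenny": {"Kyle"},
}
candidates = predicted_closures(friends)
```

A higher count means more open triangles would be closed by the same new relationship, which makes it a stronger candidate.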
Structural Balance
name:
Cartman
name: Craig name: Tweek
Structural Balance
name:
Cartman
name: Craig name: Tweek
name:
Cartman
name: Craig name: Tweek
FRIEND
Structural Balance
name:
Cartman
name: Craig name: Tweek
name:
Cartman
name: Craig name: Tweek
ENEMY
Structural Balance
name: Kyle
name: Stan name: Kenny
name: Kyle
name: Stan name: Kenny
FRIEND
Structural Balance is a key
predictive technique
And it’s domain-agnostic
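The balance rule for a single triangle is small enough to write down directly: three FRIEND relationships, or one FRIEND and two ENEMY relationships, are balanced; anything else is not. The sketch below is a minimal illustration, and the function name and string labels are assumptions made for the example.

```python
def is_balanced(ab, bc, ac):
    """Each argument is 'FRIEND' or 'ENEMY'.

    A triangle is balanced when it has an even number of ENEMY edges:
    zero (all friends) or two (two friends sharing a common enemy).
    """
    enemies = [ab, bc, ac].count("ENEMY")
    return enemies % 2 == 0


# Cartman-Craig FRIEND, Cartman-Tweek ENEMY (as described in the editor's notes).
craig_befriends_tweek = is_balanced("FRIEND", "FRIEND", "ENEMY")  # Craig-Tweek FRIEND
craig_opposes_tweek = is_balanced("FRIEND", "ENEMY", "ENEMY")     # Craig-Tweek ENEMY
```

Nothing in the rule refers to people or playgrounds, which is why the same check can be applied to any signed graph.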
Allies and Enemies
UK
Germany France
Russia Italy
Austria
Predicting WWI
[Easley and Kleinberg]
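The same balance test can be run over a whole network by checking every triangle. The sketch below assumes the end state described in the editor's notes: the UK, France and Russia mutually allied, Germany, Austria and Italy mutually allied, and every cross-bloc pair hostile. The function names and the +1/-1 encoding are illustrative assumptions.

```python
from itertools import combinations

ALLIED, HOSTILE = 1, -1


def unbalanced_triangles(nodes, sign):
    """Return every triangle whose three edge signs multiply to a negative value."""
    return [
        (a, b, c)
        for a, b, c in combinations(sorted(nodes), 3)
        if sign(a, b) * sign(b, c) * sign(a, c) < 0
    ]


entente = {"UK", "France", "Russia"}
alliance = {"Germany", "Austria", "Italy"}
nations = entente | alliance


def final_sign(a, b):
    """Assumed end state: allied within a bloc, hostile across blocs."""
    same_bloc = {a, b} <= entente or {a, b} <= alliance
    return ALLIED if same_bloc else HOSTILE


def one_edge_flipped(a, b):
    """Hypothetical variant of the end state with the UK and Russia still hostile."""
    return HOSTILE if {a, b} == {"UK", "Russia"} else final_sign(a, b)


stable = unbalanced_triangles(nations, final_sign)
unstable = unbalanced_triangles(nations, one_edge_flipped)
```

In the assumed end state no triangle is unbalanced, while flipping the single UK-Russia edge unbalances every triangle that contains both nations.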
If a node has strong relationships to two neighbours, then these
neighbours must have at least a weak relationship between them.
[Wikipedia]
Strong Triadic Closure
Triadic Closure
(weak relationship)
name: Kenny
name: Stan name: Cartman
Triadic Closure
(weak relationship)
name: Kenny
name: Stan name: Cartman
name: Kenny
name: Stan name: Cartman
FRIEND 50%
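The Strong Triadic Closure property can be checked mechanically: for every node with two strong relationships, look for at least a weak relationship between the two neighbours. The sketch below assumes undirected ties stored as a dictionary keyed by frozenset pairs with a 'strong' or 'weak' value; that encoding and the function name are assumptions made for the example.

```python
from itertools import combinations


def stc_violations(ties):
    """Return (node, b, c) where node has strong ties to b and c but b and c have no tie."""
    strong = {}
    for pair, strength in ties.items():
        if strength == "strong":
            a, b = sorted(pair)
            strong.setdefault(a, set()).add(b)
            strong.setdefault(b, set()).add(a)
    violations = []
    for node in sorted(strong):
        for b, c in combinations(sorted(strong[node]), 2):
            if frozenset({b, c}) not in ties:
                violations.append((node, b, c))
    return violations


# Kenny has strong ties to Stan and Cartman, who have no tie at all yet.
ties = {
    frozenset({"Kenny", "Stan"}): "strong",
    frozenset({"Kenny", "Cartman"}): "strong",
}
before = stc_violations(ties)

# Adding the weak Stan-Cartman tie from the slide satisfies the property.
ties[frozenset({"Stan", "Cartman"})] = "weak"
after = stc_violations(ties)
```

Each violation is also a prediction: it names a pair where at least a weak relationship would be expected to form.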
• Relationships can have “strength” as well as intent
• Think: weighting on a relationship in a property graph
• Weak links play another super-important structural role in graph
theory
• They bridge neighbourhoods
Weak relationships
Local Bridges
FRIEND
name: Kenny
name: Stan name: Kyle
FRIEND
FRIEND
name: Sally
name: Bebe name: Wendy
FRIEND
FRIEND 50%
name:
Cartman
FRIEND
ENEMY
“If a node A in a network satisfies the Strong Triadic Closure Property
and is involved in at least two strong relationships, then any local
bridge it is involved in must be a weak relationship.”
[Easley and Kleinberg]
Local Bridge Property
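A local bridge, in Easley and Kleinberg's definition, is a relationship whose two end points have no neighbours in common. That test is easy to sketch in Python. The two friendship groups below use names from the slides, but the individual relationships, including the Stan-Wendy link that joins the groups, are assumptions made for this example.

```python
def local_bridges(adjacency):
    """Return edges (a, b) whose end points share no common neighbour."""
    bridges = []
    for a in sorted(adjacency):
        for b in sorted(adjacency[a]):
            if a < b and not (adjacency[a] & adjacency[b]):
                bridges.append((a, b))
    return bridges


# Hypothetical undirected friendships: two tight groups joined by one link.
edges = [
    ("Kenny", "Stan"), ("Kenny", "Kyle"), ("Stan", "Kyle"),
    ("Sally", "Bebe"), ("Sally", "Wendy"), ("Bebe", "Wendy"),
    ("Stan", "Wendy"),
]
adjacency = {}
for a, b in edges:
    adjacency.setdefault(a, set()).add(b)
    adjacency.setdefault(b, set()).add(a)

bridges = local_bridges(adjacency)
```

Only the link between the groups qualifies; under the Local Bridge Property it would be expected to be a weak relationship if its end points also have two strong ones.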
University Karate Club
• (NP) Hard problem
• Repeatedly remove the spanning links between dense regions
• Or recursively merge nodes into ever larger “subgraph” nodes
• Choose your algorithm carefully – some are better than others for
a given domain
• Can be used to (almost exactly) predict the
break up of the karate club!
Graph Partitioning
University Karate Clubs
(predicted by Graph Theory)
9
University Karate Clubs
(what actually happened!)
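One way to "repeatedly remove the spanning links between dense regions" is the Girvan-Newman approach: compute how many shortest paths run over each edge, remove the busiest edge, and repeat until the graph falls apart. The sketch below is a swapped-in illustration of that idea, not the method Zachary used, and it runs on a small invented graph of two triangles joined by one edge rather than the real karate club data.

```python
from collections import deque


def edge_betweenness(adj):
    """Brandes-style edge betweenness for an unweighted, undirected graph."""
    score = {}
    for source in adj:
        # Breadth-first search, counting shortest paths (sigma) to each node.
        dist = {source: 0}
        sigma = {node: 0 for node in adj}
        sigma[source] = 1
        parents = {node: [] for node in adj}
        order = []
        queue = deque([source])
        while queue:
            node = queue.popleft()
            order.append(node)
            for nb in adj[node]:
                if nb not in dist:
                    dist[nb] = dist[node] + 1
                    queue.append(nb)
                if dist[nb] == dist[node] + 1:
                    sigma[nb] += sigma[node]
                    parents[nb].append(node)
        # Walk back from the farthest nodes, sharing credit along shortest paths.
        credit = {node: 0.0 for node in adj}
        for node in reversed(order):
            for parent in parents[node]:
                share = sigma[parent] / sigma[node] * (1 + credit[node])
                edge = frozenset((parent, node))
                score[edge] = score.get(edge, 0.0) + share
                credit[parent] += share
    return score


def components(adj):
    """Connected components as a list of node sets."""
    seen, result = set(), []
    for start in adj:
        if start in seen:
            continue
        group, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node not in group:
                group.add(node)
                stack.extend(adj[node] - group)
        seen |= group
        result.append(group)
    return result


def split_once(adj):
    """Remove highest-betweenness edges until one more component appears."""
    adj = {node: set(nbs) for node, nbs in adj.items()}
    target = len(components(adj)) + 1
    while len(components(adj)) < target:
        scores = edge_betweenness(adj)
        if not scores:
            break  # no edges left to remove
        a, b = max(scores, key=scores.get)
        adj[a].discard(b)
        adj[b].discard(a)
    return components(adj)


# Toy data: two triangles (1, 2, 3) and (4, 5, 6) joined by the edge 3-4.
edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6)]
adjacency = {}
for a, b in edges:
    adjacency.setdefault(a, set()).add(b)
    adjacency.setdefault(b, set()).add(a)

clubs = split_once(adjacency)
```

Every shortest path between the two triangles crosses the 3-4 edge, so it scores highest and is removed first, leaving the two groups. No domain knowledge is involved, only the structure.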
Clustering
• Label Propagation
• Union Find / Weakly Connected Components
• Strongly Connected Components
• Triangle-Count / Clustering Coefficient
Centrality
• PageRank
• Betweenness
• Closeness
• Degree
Path Finding
• Breadth-first search
• Depth-first search
• Single-source shortest path
• All-pairs shortest path
• Minimum weight spanning tree
Graph Algorithms in Neo4j
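To give a feel for what one of these centrality algorithms computes, here is a minimal PageRank by power iteration in plain Python. It is a sketch for intuition only, not the Neo4j implementation; the function signature, the damping value of 0.85, the handling of nodes without outgoing links and the four-node sample graph are all assumptions made for the example.

```python
def pagerank(links, damping=0.85, iterations=20):
    """Power-iteration PageRank over a dict of node -> list of target nodes."""
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / len(nodes) for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A node with no outgoing links spreads its rank evenly.
                for target in nodes:
                    new_rank[target] += damping * rank[node] / len(nodes)
        rank = new_rank
    return rank


# Hypothetical directed graph: three nodes point at C, and C points back at A.
links = {"A": ["C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(links)
most_central = max(ranks, key=ranks.get)
```

The ranks always sum to one, so they can be read as the share of a random walker's time spent on each node.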
Amazing Native Graph Performance
Credit: https://reezocar.blob.core.windows.net/blog/2015/09/k2000.jpg
Find and stop spammers
Extract graph structure over time
Not message content!
(Fakhraei et al, KDD 2015)
Learning to stop bad guys
Result: find and classify 70% of spammers with 90% accuracy
Much of modern graph ML is still about turning graphs to vectors
Graph2Vec and friends
Highly complementary techniques
Mixing structural data and features gives better results
Better data into the model, better results out
But we don’t have to always vectorize graphs...
Graph ML
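"Turning graphs to vectors" can be as simple as computing a few structural measures per node and handing them to an ordinary classifier alongside the node's other features. The sketch below extracts degree, triangle count and local clustering coefficient; the choice of these three features, the function name and the sample graph are assumptions made for the example, and this is far simpler than Graph2Vec.

```python
def structural_features(adjacency, node):
    """Return [degree, triangles, clustering coefficient] for one node."""
    neighbours = adjacency[node]
    degree = len(neighbours)
    # Each edge between two neighbours closes one triangle through `node`;
    # the sum sees every such edge twice, hence the division by two.
    triangles = sum(len(adjacency[n] & neighbours) for n in neighbours) // 2
    possible = degree * (degree - 1) / 2
    clustering = triangles / possible if possible else 0.0
    return [degree, triangles, clustering]


# Hypothetical undirected graph: a triangle a-b-c with d hanging off a.
adjacency = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
vectors = {node: structural_features(adjacency, node) for node in sorted(adjacency)}
```

The resulting rows can be concatenated with non-graph features, which is the "mixing structural data and features" idea on the slide.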
Knowledge Graphs
• Semantic domain knowledge for
inference and understanding
• E.g. eBay Google Assistant
• What’s the next best question to ask
when a potential customer says they
want a bag?
• Price? Function? Colour?
• Depends on context! Demographic,
history, user journey.
• Richly connected data makes the
system seem intelligent
• But it’s “just” data and algorithms in
reality
Graph Convolutional
Neural Networks
A general architecture for
predicting node and relationship
attributes in graphs.
(Kipf and Welling, ICLR 2017)
Credit: Andrew Docherty (CSIRO), YowData 2017
https://www.youtube.com/watch?v=Gmxz41L70Fg
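A single graph convolutional layer in the style of Kipf and Welling computes H' = ReLU(D^-1/2 (A + I) D^-1/2 H W): each node averages its own and its neighbours' features with a symmetric normalisation, then applies a shared weight matrix. The pure-Python sketch below shows only that forward pass; the three-node graph, the identity feature and weight matrices and the helper names are assumptions made for the example, and no training is performed.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def gcn_layer(adjacency, features, weights):
    """One forward pass of ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    n = len(adjacency)
    # Add self-loops so each node keeps its own features.
    a_hat = [[adjacency[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]
    degree = [sum(row) for row in a_hat]
    # Symmetric normalisation by the degrees of both end points.
    norm = [
        [a_hat[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
        for i in range(n)
    ]
    transformed = matmul(matmul(norm, features), weights)
    return [[max(0.0, value) for value in row] for row in transformed]


# Hypothetical path graph 0 - 1 - 2 with one-hot features and identity weights.
adjacency = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
hidden = gcn_layer(adjacency, identity, identity)
```

Stacking several such layers lets information flow across longer paths, since each layer mixes in features from one more hop away.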
Graph Networks for
Structured Causal Models
• Position paper from Google,
MIT, Edinburgh
• Structured representations and
computations (graphs) are key
• Goal: generalize beyond direct
experience
• Like human infants can
https://arxiv.org/pdf/1806.01261.pdf
credit: @markhneedham
Thanks for listening
Ask the experts session tomorrow 14:30
@jimwebber


Editor's Notes

  • #2: Focus: is on graph analytics in this talk.
  • #4: ML: this is what nerds do. Sometimes ML is so compelling that it seems intelligent, but in reality it’s data and algorithms. AI: train a system to classify animals, and it might also work on shoes. See: hot dog; not hot dog! GP-AI: systems like AlphaGo might be an architecture to support this in future, but we’re not there today.
  • #5: GP-AI - systems like AlphaGo might be an architecture to support this in future, but we’re not there today
  • #7: Here’s where we are mostly today. Row-oriented data. Maybe some documents, maybe some columns, but mostly rows of data from arcane data models.
  • #10: You already know graphs
  • #11: People talk about Codd’s relational model being mature because it was proposed in 1969 – 49 years old. Euler’s graph theory was proposed in 1736 – 282 years old. Now we use the labelled property graph model. A very simple set of idioms that can build very sophisticated models.
  • #12: Graphs are the most natural way to model most domains. You already know this because you draw graphs on a whiteboard, but you’ve never had the opportunity to take that down into the database before. Nodes are a bit like documents, but they’re flat at present in Neo4j. You pour data into your nodes and then connect them – easy peasy. This enables high fidelity domain modeling because this is how your domains work. And you don’t have to do this stuff in your application code – it’s right there in the database Let’s prove it by exploring a fun domain…
  • #12, #13: Graphs are the most natural way to model most domains. You already know this because you draw graphs on a whiteboard, but you’ve never had the opportunity to take that down into the database before. Nodes are a bit like documents, but they’re flat at present in Neo4j. You pour data into your nodes and then connect them: easy peasy. This enables high fidelity domain modeling because this is how your domains work. And you don’t have to do this stuff in your application code: it’s right there in the database. Let’s prove it by exploring a fun domain…
  • #15: What if you want to know who preceded Matt Smith? Easy. Traverse the regenerated rels in the other way. Cost? About 1/40 millionth of a second on this laptop in a steady state database.
  • #16: Joins are super cheap for good graph DBs. On my laptop, I can get to 40M traversals/sec in a steady state DB. You can explore a lot of data very quickly, which makes it a good fit for data intensive applications like ML.
  • #17: My shortest path to Doctor Who?
  • #18: But before we get to ML, let’s take a step back into my history building smart systems
  • #19: All the way back to Autumn 2008
  • #20: November 2007: met Emil at Øredev in Malmö, Sweden. A Java and Maven build-your-own-DBMS toolkit called Neo4j. Java Core API only. A long afternoon of loading data and writing a recommendation query...
  • #22: Find the current customer. Find things they own. Find things that depend on the things they own. Sell. Repeat. All we did at first was understand the dependencies between products and bundles. We never tried to upsell something incompatible. Never tried to sell them something they already owned. Never undersold them. And it opened a world of possibilities to combine other graphs: demographic, social, geographical, municipal, network... The system made intelligent suggestions, but it was not ML or AI, just graph queries. It was good.
  • #23: Unexpectedly powerful. Solved in a long afternoon a problem that was meant to take years with off-the-shelf software. Applied the same pattern to PoS retail recommendations, fraud detection… in subsequent months. Still amazed! Effect: joined Neo4j as Chief Scientist in 2010. So let’s get into graphs.
  • #24: Realtime retail recommendations. Historical anecdote about beer and nappies.
  • #25: Large UK retailer. We had a data model: some of it taxonomical, some of it stock-centric, some transactional.
  • #26:
    START n=node(*) MATCH n-[r?]->() DELETE n,r
    CREATE (daddy1:Person { name: 'Mickey Smith', dob: 19781006 })
    CREATE (alcohol:Category { category : 'alcoholic drinks'})
    CREATE (beer:Category { category : 'beer'})
    CREATE beer-[:MEMBER_OF]->alcohol
    CREATE (peeweePilsner:Product { sku: '2555f258', product: 'Peewee Pilsner' })
    CREATE (badgersNadgers:Product { sku: '5e175641', product: 'Badgers Nadgers Ale' })
    CREATE peeweePilsner-[:MEMBER_OF]->beer
    CREATE badgersNadgers-[:MEMBER_OF]->beer
    CREATE daddy1-[:BOUGHT]->peeweePilsner
    CREATE daddy1-[:BOUGHT]->badgersNadgers
    CREATE (baby:Category { category: 'baby' })
    CREATE (nappies:Category { category: 'nappies' })
    CREATE nappies-[:MEMBER_OF]->baby
    CREATE (babyDryNights:Product { sku: '49d102bc', product: 'Baby Dry Nights'})
    CREATE babyDryNights-[:MEMBER_OF]->nappies
    CREATE daddy1-[:BOUGHT]->babyDryNights
    CREATE (consumerElectronics:Category { category: 'consumer electronics' })
    CREATE (console:Category { category: 'console' })
    CREATE (xbox:Product { sku: '49d102bc', product: 'XBox 360' })
    CREATE xbox-[:MEMBER_OF]->(console)-[:MEMBER_OF]->consumerElectronics
    CREATE daddy1-[:BOUGHT]->xbox
    CREATE (mummy1:Person { name: 'Rose Tyler', dob: 19800317 })
    CREATE (wine:Product { sku:'3a3f22bc', product: 'Shiraz' })
    CREATE wine-[:MEMBER_OF]->alcohol
    CREATE mummy1-[:BOUGHT]->wine
    CREATE mummy1-[:BOUGHT]->babyDryNights
    CREATE (daddy2:Person { name: 'Rory Williams', dob: 19880121 })
    CREATE daddy2-[:BOUGHT]->peeweePilsner
    CREATE daddy2-[:BOUGHT]->babyDryNights

    // Cypher 1.0 query
    START beer=node(2), nappies=node(7), xbox=node(11)
    MATCH (daddy)-[:BOUGHT]->()-[:MEMBER_OF]->(beer),
          (daddy)-[:BOUGHT]->()-[:MEMBER_OF]->(nappies),
          (daddy)-[b?:BOUGHT]->(xbox)
    WHERE b is null
    RETURN distinct daddy

    // Cypher 2.0 query
    MATCH (d:Person)-[:BOUGHT]->()-[:MEMBER_OF]->(n:Category),
          (d:Person)-[:BOUGHT]->()-[:MEMBER_OF]->(b:Category),
          (c:Category)
    WHERE n.category = "nappies" AND
          b.category = "beer" AND
          c.category = "console" AND
          NOT((d)-[:BOUGHT]->()-[:MEMBER_OF]->(c))
    RETURN DISTINCT d AS daddy
  • #27: The insight here is that, simply by reducing the subgraph, we have a typical young father who buys beer, nappies and a game console. We have a pattern to search for.
  • #28: We knew it was young fathers, but I bet your model would classify them as lazy, drunken, gamers right?
  • #29: Now we look for young fathers – implied by beer and nappies purchases – who haven’t bought a game console.
  • #30: Turn it to text. And…
  • #33 to #36: Neo4j 2.0:
    MATCH (u:User), (n:ProductType), (b:ProductType), (x:ProductType)
    WHERE n.name = "nappies" AND b.name = "Beer" AND x.name = "Xbox"
      AND (u)-[:BOUGHT]->()<-[:MEMBER_OF]-(n)
      AND (u)-[:BOUGHT]->()<-[:MEMBER_OF]-(b)
      AND NOT((u)-[:BOUGHT]->()<-[:MEMBER_OF]-(x))
    RETURN u
  • #42: This is fast: query latency is proportional to the amount of graph searched
  • #43: Now called “network science”
  • #44: First we need to talk about some local properties
  • #45: A triadic closure is a local property of (social) graphs whereby if two nodes are connected via a path involving a third node, there is an increased likelihood that the two nodes will become directly connected in future. This is a familiar enough situation for us in a social setting whereby if we happen to be friends with two people, ultimately there's an increased chance that those people will become direct friends too, since by being our friend in the first place, it's an indication of social similarity and suitability. It’s called triadic closure, because we try to close the triangle.
  • #46: We see this all the time – it’s likely that if we have two friends, that they will also become at least acquaintances and potentially friends themselves! In general, if a node A has relationships to B & C then the relationship between B&C is likely to form – especially if the existing relationships are both strong. This is an incredibly strong assertion and will not be typically upheld by all subgraphs in a graph. Nonetheless it is sufficiently commonplace (particularly in social networks) to be trusted as a predictive aid.
  • #47: Sentiment plays a role in how closures form too – there is a notion of balance.
  • #48: From a triadic closure perspective this is OK, but intuitively it seems odd. Cartman’s friends shouldn’t be friends with his enemies. Nor should Cartman’s enemies be friends with his friends.
  • #49: This makes sense – Cartman’s friend Craig is also an enemy of Cartman’s enemy Tweek Two negative sentiments and one positive sentiment is a balanced structure – and it makes sense too since we gang up with our friends on our poor beleaguered enemy
  • #50: Is this true? Yes. Is it nice? No. Is it realistic? Oh yes.
  • #51: Another balanced – and more pleasant – arrangement is for three positive sentiments, in this case mutual friends.
  • #54: A starting point for a network of friends and enemies, 100 years on from the armistice. Red links indicate an enemy-of relationship; black links indicate a friend-of relationship. The Three Emperors’ League.
  • #55: Italy forms an alliance with Austria and Germany: a balanced +++ triadic closure. If Italy had made only a single alliance (or enemy) it would have been unstable and another relationship would be likely to form anyway! The Triple Alliance.
  • #56: Russia becomes hostile to Austria and Germany, a balanced --+ triadic closure. It becomes agnostic towards France. The German-Russian Lapse.
  • #57: The French and Russians ally, forming a balanced --+ triadic closure with the UK. The French-Russian Alliance.
  • #58: The UK and France enter into the famous Entente Cordiale. This produces an unbalanced ++- triadic closure with Russia, and the graph doesn’t like it.
  • #59: The British and Russians form an alliance, thereby changing their previously unbalanced triadic closure into a balanced one. Other local pressures on the graph make other closures form. Italy becomes hostile to Russia, forming a balanced --+ closure with France, and another balanced --+ closure with the UK. Germany and the UK become hostile, forming a balanced --+ closure with Austria and another balanced --+ closure with Italy. The British-Russian Alliance.
  • #60: That WWI can be predicted without domain knowledge by iterating a graph and applying local structural constraints is nothing short of astonishing to me. Note how the network slides into a balanced labeling — and into World War I.
  • #61: A very surprising result: graphs don’t know about human conflicts.
  • #63, #64: In this case the strong triadic closure property still holds, though it is a weak link that characterises the relationship between Stan and Cartman. Given a starting graph, we can apply this simple local principle to see how it would evolve.
  • #66: A local bridge acts as a link – perhaps the only realistic link - between two otherwise distant (or separate) subgraphs. Local bridges are semantically rich – they provide conduits for information flow between otherwise independent groups. In this case DATING is a local bridge – it must also be a weak relationship according to our definition of a local bridge Intuitively this makes sense – your girl/boyfriend is rather less important at age 8 than your regular friends, IIRC.
  • #67: How do we identify local bridges? Any weak link whose removal would cause a component of the graph to become disconnected. Being able to identify local bridges is important: in this case it’s the only known conduit to allow the girls and boys to communicate. In real life, local bridges are apparent in your organisation as experts (or managers), and appear as a nexus in fraud cases.
  • #68: Zachary, in the Journal of Anthropological Research, 1977. Intuitively we can see “clumps” in this graph. But how do we separate them out? It’s called minimum cut.
  • #69: What’s interesting is that it’s mechanical: no domain knowledge is necessary. There’s only one failure with the method Zachary chose to partition the graph: node 9 should have gone to the instructor’s club but instead went with the original president of the club (node 34). Why? Because the student was three weeks away from completing a four-year quest to obtain a black belt, which he could only do with the instructor (node 1). Other minimum cut approaches might deliver slightly different results, but on the whole it’s amazing you get such insight from an algorithm!
  • #70: But is there enough information in the graph itself to predict the schism?
  • #71: But is there enough information in the graph itself to predict the schism?
  • #72: Actually Neo4j already has a bunch of these algorithms. Call them easily from Cypher. Emergent intelligence from the graph!
  • #73: Efficiency for graph operations is paramount. You don’t need huge macho clusters to do this.
  • #74: Large payment provider, transaction history. A 300M node, ~18B relationship graph PageRanked with 20 iterations in less than 2 hours using the graph algos, on commodity hardware.
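For readers unfamiliar with what "20 iterations" of PageRank means, the following is a minimal power-iteration sketch in plain Python. It is not the Neo4j implementation; the damping factor of 0.85, the function name, and the tiny payments graph are assumptions for illustration.

```python
def pagerank(out_links, damping=0.85, iterations=20):
    """Plain power-iteration PageRank over a dict of node -> list of targets."""
    nodes = set(out_links) | {t for targets in out_links.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing links is spread evenly over all nodes.
        dangling = sum(rank[v] for v in nodes if not out_links.get(v))
        new_rank = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for src, targets in out_links.items():
            for target in targets:
                # Each node shares its current rank equally among its targets.
                new_rank[target] += damping * rank[src] / len(targets)
        rank = new_rank
    return rank

# Hypothetical payments graph: who pays whom.
payments = {
    "alice": ["shop"],
    "bob": ["shop"],
    "carol": ["shop", "bob"],
    "shop": [],
}
ranks = pagerank(payments)
top_node = max(ranks, key=ranks.get)
```

Each iteration is a single pass over the relationships, which is why the cost scales with relationship count and why a fixed number of passes over billions of relationships is feasible on modest hardware.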
  • #75: Contemporary AI
  • #76: Graph structure itself is rich. In this example we don’t need to know the content of the messages to know they’re spam at high confidence, just their position in the graph. Mine a vector of graph features, feed it into the trained model. Graphs have a key advantage: structural context. Where is the node in the graph? Who are its neighbours? Etc. That richness feeds into the model and makes it better, more accurate, more dependable. PageRank, Degree, Neighbourhood, Colour, etc. are all features that improve your ML outcomes but are only available from graphs.
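"Mine a vector of graph features" can be sketched as follows: for every node, compute a few purely structural numbers and pack them into a list ready for a classifier. The three features chosen here (degree, local clustering coefficient, number of two-hop neighbours) and the toy graph are assumptions for illustration; a real pipeline would likely add PageRank, community labels, and so on.

```python
from itertools import combinations

def node_features(edges):
    """Map each node to [degree, clustering coefficient, two-hop neighbour count]."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    features = {}
    for node, nbrs in adj.items():
        degree = len(nbrs)
        # Clustering: fraction of neighbour pairs that are themselves linked.
        linked_pairs = sum(1 for x, y in combinations(nbrs, 2) if y in adj[x])
        possible_pairs = degree * (degree - 1) / 2
        clustering = linked_pairs / possible_pairs if possible_pairs else 0.0
        # Two-hop neighbours: friends of friends that are not direct neighbours.
        two_hop = set().union(*(adj[n] for n in nbrs)) - nbrs - {node}
        features[node] = [degree, clustering, len(two_hop)]
    return features

# Hypothetical toy graph: one hub messaging three accounts, two of which know each other.
message_edges = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("a", "b")]
feature_vectors = node_features(message_edges)
```

None of these numbers looks at message content; they describe only where a node sits in the graph, which is the point the note makes.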
  • #79: ICLR = International Conference on Learning Representations. Graph of movies that a user liked: feed into neural net. Graph of users who rated one of those movies: feed into neural net. Recurse through the data until you get to all the movies and all the users, which are just embedding vectors (fancy hashes that place like near like in a vector space). [Can swap these vectors for features to avoid cold-starts, without changing the overall architecture.] Graph of back-propagated trained neural nets. Incremental: scalable for both training and prediction. Extensible: bring in other graph layers! Better than collaborative filtering because it can work on any graph, not just bipartite user-likes-movies graphs. E.g. user likes actor in movies with genre: much richer! A bipartite graph, also called a bigraph, is a set of graph vertices decomposed into two disjoint sets such that no two graph vertices within the same set are adjacent. I.e. users don’t connect to users, only to movies. This is already happening: it’s YouTube’s recommender algorithm.
  • #80: A growing realisation from leaders in the AI community: graph networks as the foundational building block for human-like AI. They argue that combinatorial generalization must be a top priority for AI to achieve human-like abilities. AI must be able to compose a finite set of elements in infinite ways (e.g. like language). We draw analogies by aligning the relational structure between two domains and drawing inferences about one based on corresponding knowledge about the other (Gentner and Markman, 1997; Hummel and Holyoak, 2003). Hierarchies are critical. Inductive bias: how the algorithm prioritises solutions. Relational inductive biases guide deep learning about entities, relations, and rules for composing them. I.e. the learning understands graphs.
  • #81: All this might seem hard at first: we’re used to tables, and our toolkits expect them. Graphs change this for the better. Once you get graphs, all the other things seem hard.
  • #82: “a vast gap between human and machine intelligence remains, especially with respect to efficient, generalizable learning”. 70% of graph ML today is still turning graphs to vectors. E.g. DeepWalk: take random walks through the graph and assign each node a vector based on the neighborhood in which it is encountered. 30% is truly graph AI, e.g. the “differentiable neural computer”, which can discern patterns that users can’t and write sophisticated algorithms (fraud, shortest path, etc.) from incentive declarations. E.g. we no longer need a human expert to discover the “young father” pattern in our data; the machine learns it’s a valuable query in some contexts. So enjoy using graphs for AI, but please remember graphs for good!
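The random-walk step mentioned for DeepWalk can be sketched as below. This covers only walk generation: in the full technique the walks are then treated like sentences and fed to a skip-gram style model to learn one vector per node, which is not shown here. The function name, parameter defaults, and toy graph are assumptions for illustration.

```python
import random

def random_walks(adj, walks_per_node=2, walk_length=5, seed=42):
    """Generate fixed-length uniform random walks starting from every node."""
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    walks = []
    for _ in range(walks_per_node):
        starts = sorted(adj)
        rng.shuffle(starts)  # visit start nodes in a fresh random order each pass
        for start in starts:
            walk = [start]
            while len(walk) < walk_length:
                neighbours = sorted(adj[walk[-1]])
                if not neighbours:
                    break  # dead end: stop this walk early
                walk.append(rng.choice(neighbours))
            walks.append(walk)
    return walks

# Hypothetical toy graph as an adjacency dict.
toy_graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
walks = random_walks(toy_graph)
```

Nodes that appear in similar walk contexts end up with similar vectors, which is how neighbourhood structure gets turned into something a conventional model can consume.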