Graphs & Big Data
The Power of Graphs &
The Technology Ecosystem Around Graphs
Philip Rathle
Sr. Director of Products
philip@neotechnology.com
@prathle
Andreas Kollegger
Product Experience Manager
andreas@neotechnology.com
@akollegger
Graphs & Big Data - Philip Rathle and Andreas Kollegger @ Big Data Science Meetup, Fremont, CA
V
V
V
300 m
V
V
V
VVValue!
Early Adopters of Graph Tech
Evolution of Web Search
Survival of the Fittest
Pre-1999
WWW Indexing
Discrete Data
1999 - 2012
Google Invents
PageRank
Connected Data
(Simple)
2012-?
Google Knowledge Graph,
Facebook Graph Search
Connected Data
(Rich)
Evolution of Online Recruiting
1999
Keyword Search
Discrete Data
Survival of the Fittest
2011-12
Social Discovery
Connected Data
Emergent Graph in Other Industries
(Actual Neo4j Graphs)
Insurance Risk Analysis
Content Management
& Access Control
Network Asset
Management
Network Cell Analysis
Geo Routing
(Public Transport)
BioInformatics
Emergent Graph in Other Industries
(Actual Neo4j Graphs)
Web Browsing
Portfolio Analytics
Gene Sequencing
Mobile Social Application
What’s a Graph?
LIVES WITH
LOVES
OWNS
DRIVES
LOVES
name:“James”
age: 32
twitter:“@spam”
name:“Mary”
age: 35
property type:“car”
brand:“Volvo”
model:“V70”
Graph data model
AUTHENTICATES
TRANSMITS_DATA
ASSIGNED_TO
CONNECTS_TO
AUTHORIZES
type:“Mobile”
model:“iPhone 5”
IMEI: 99 000107 765315 1
type:“BTS”
Height_m: 9.8
Power: 400A
Backup_Generator:”Y”
device type:“Test”
type:“Trunk”
mbps_capacity:5000
type:“Central Office”
CLLI Code:“PTLEORTEDS0”
Graph data model
Member
143  Andreas

Group
326  Big Data Fremont
725  Big Data San Francisco
981  Big Data Boston

Member_Group
143  981
143  725
143  326
uid: ABK
name: Andreas
uid: FRE
where: Fremont
uid: SFO
where: San Francisco
uid: BOS
where: Boston
Nodes
Relationships
member
member
member
A Property Graph
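The property graph on this slide can be sketched in a few lines of Python. This is only an illustration of the data model, not how Neo4j stores data: the `Node`, `Relationship`, and `relate` names are hypothetical, and the direction of the `member` relationships (from Andreas to each group) is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Arbitrary key/value properties, as on the slide (uid, name, where).
    properties: dict
    # Outgoing relationships are stored directly on the node.
    relationships: list = field(default_factory=list)


@dataclass
class Relationship:
    type: str
    start: Node
    end: Node


def relate(start, rel_type, end):
    """Create a typed relationship and attach it to its start node."""
    rel = Relationship(rel_type, start, end)
    start.relationships.append(rel)
    return rel


andreas = Node({"uid": "ABK", "name": "Andreas"})
groups = [
    Node({"uid": "FRE", "where": "Fremont"}),
    Node({"uid": "SFO", "where": "San Francisco"}),
    Node({"uid": "BOS", "where": "Boston"}),
]
for group in groups:
    relate(andreas, "member", group)

# Follow the "member" relationships from Andreas to his groups.
member_of = [
    rel.end.properties["where"]
    for rel in andreas.relationships
    if rel.type == "member"
]
print(member_of)
```

Compared with the Member, Group, and Member_Group tables, the join table disappears: each membership becomes a relationship between two nodes.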
What Can You Do With
Graphs?
* Cypher query language example: http://maxdemarzi.com/?s=facebook
MATCH (me:Person)-[:IS_FRIEND_OF]->(friend),
(friend)-[:LIKES]->(restaurant),
(restaurant)-[:LOCATED_IN]->(city:Location),
(restaurant)-[:SERVES]->(cuisine:Cuisine)
WHERE me.name = 'Philip' AND city.location='New York' AND
cuisine.cuisine='Sushi'
RETURN restaurant.name
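To show what this pattern match does, here is a rough Python sketch that walks the same four relationship types over a tiny in-memory edge list. The sample data is hypothetical (only 'Philip', 'New York', and 'Sushi' come from the query), and the `targets` and `recommend` helpers are illustrative names, not part of any Neo4j API.

```python
# Hypothetical sample data as (start, relationship type, end) triples.
edges = [
    ("Philip", "IS_FRIEND_OF", "Alice"),
    ("Philip", "IS_FRIEND_OF", "Bob"),
    ("Alice", "LIKES", "Sushi Place A"),
    ("Bob", "LIKES", "Burger Place B"),
    ("Bob", "LIKES", "Sushi Place C"),
    ("Sushi Place A", "LOCATED_IN", "New York"),
    ("Burger Place B", "LOCATED_IN", "New York"),
    ("Sushi Place C", "LOCATED_IN", "Boston"),
    ("Sushi Place A", "SERVES", "Sushi"),
    ("Burger Place B", "SERVES", "Burgers"),
    ("Sushi Place C", "SERVES", "Sushi"),
]


def targets(source, rel_type):
    """Return the end nodes of all rel_type relationships leaving source."""
    return {end for start, rel, end in edges if start == source and rel == rel_type}


def recommend(me, city, cuisine):
    """Restaurants liked by my friends that are in the city and serve the cuisine."""
    results = set()
    for friend in targets(me, "IS_FRIEND_OF"):
        for restaurant in targets(friend, "LIKES"):
            in_city = city in targets(restaurant, "LOCATED_IN")
            serves = cuisine in targets(restaurant, "SERVES")
            if in_city and serves:
                results.add(restaurant)
    return sorted(results)


print(recommend("Philip", "New York", "Sushi"))  # ['Sushi Place A']
```

Unlike the Cypher query, which returns one row per matching path, this sketch collects results in a set, so a restaurant liked by several friends appears once.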
What drugs will bind to protein X and not interact with drug Y?
Of course... a graph is a graph is a graph
Real-Time / OLTP
Offline / Batch
Connected Data
The Zone of SQL Adequacy
Connectedness of Data Set
Performance
SQL database
Requirement of application
Salary List
ERP
CRM
Network / Data Center
Management
Social
Master Data
Management
Geo
Graph Database
Optimal Comfort Zone
Graph Technology
Ecosystem
#1: Graph Local Queries
e.g. Recommendations, Friend-of-Friend, Shortest Path
#2: Graph Global Queries
How many restaurants, on average, has each person liked?
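A graph global query touches the whole graph, where a local query starts from one node. A minimal sketch of the question above, using hypothetical people and restaurants (none of this data is from the talk):

```python
# Hypothetical LIKES relationships: person -> restaurants they liked.
likes = {
    "Ann": ["Restaurant 1", "Restaurant 2"],
    "Ben": ["Restaurant 1"],
    "Cat": [],
}


def average_likes(likes_by_person):
    """Average number of liked restaurants per person, over every person."""
    if not likes_by_person:
        return 0.0
    total = sum(len(restaurants) for restaurants in likes_by_person.values())
    return total / len(likes_by_person)


print(average_likes(likes))  # 1.0
```

The cost grows with the size of the whole data set, which is why such questions are often handed to a batch-oriented engine.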
Key Graph Analytic Technologies
Data Storage & Processing:
• Graph Databases
• Graph Compute Engines
Programming:
• Graph-Centric APIs & Languages
• Graph Algorithms
Tools:
• Visualization Tools & Libraries
• Other
What is a Graph Database?
“A graph database... is an online database management system with CRUD methods that expose a graph data model” [1]
• Two important properties:
  • Native graph storage engine: written from the ground up to manage graph data
  • Native graph processing, including index-free adjacency to facilitate traversals
[1] Robinson, Webber, Eifrem. Graph Databases. O’Reilly, 2013. p. 5. ISBN-10: 1449356265
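The idea behind index-free adjacency can be illustrated by contrasting two ways of answering "which groups does this member belong to?". The sketch below reuses the Member/Group data from earlier in the deck. It is a conceptual illustration only: the class and function names are hypothetical, and it says nothing about how any particular graph database lays out its storage.

```python
# Relational style: a join table that must be scanned (or indexed) per lookup.
groups = {326: "Big Data Fremont", 725: "Big Data San Francisco", 981: "Big Data Boston"}
member_group = [(143, 981), (143, 725), (143, 326)]


def groups_via_join(member_id):
    """Scan the join table, then resolve each group id through a key lookup."""
    return [groups[gid] for mid, gid in member_group if mid == member_id]


# Graph style: each node holds direct references to its neighbours.
class Node:
    def __init__(self, name):
        self.name = name
        self.neighbors = []  # direct references, no lookup structure in between


andreas = Node("Andreas")
for _, gid in member_group:
    andreas.neighbors.append(Node(groups[gid]))


def groups_via_adjacency(node):
    """Follow the references stored on the node itself."""
    return [neighbor.name for neighbor in node.neighbors]


print(groups_via_join(143))
print(groups_via_adjacency(andreas))
```

Both calls return the same groups. The difference is that the second touches only the start node and its neighbours, so its cost depends on the node's degree, not on the total number of memberships stored.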
Neo Technology, Inc Confidential
Graph Databases are Designed to:
1. Store inter-connected data
2. Make it easy to make sense of that data
3. Enable extreme-performance operations for:
• Discovery of connected data patterns
• Relatedness queries > depth 1
• Relatedness queries of arbitrary length
4. Make it easy to evolve the database
Top Reasons People Use
Graph Databases
1. Problems with Join performance.
2. Continuously evolving data set
(often involves wide and sparse tables)
3. The Shape of the Domain is
naturally a graph
4. Open-ended business
requirements necessitating fast,
iterative development.
Graph Compute Engine
Processing engine that enables graph global
computational algorithms to be run against
large data sets
Graph Mining
Engine
(Working Storage)
In-Memory Processing
System(s)
of Record
Graph Compute
Engine
Data extraction,
transformation,
and load
Graph Database Deployment
Application
Other Databases
ETL
Graph Database Cluster
Data Storage & Business Rules Execution
Reporting
Graph Dashboards & Ad-hoc Analysis
Graph Visualization
End User Ad-hoc visual navigation & discovery
Bulk Analytic Infrastructure (e.g. Graph Compute Engine)
ETL
Graph Mining & Aggregation
Data Scientist Ad-Hoc Analysis
Graph Dashboards
Fraud Detection & Money Laundering
IT Service Dependencies
Network Cell Analysis
Graphs in the Real World
Case Study Examples & Working with Graphs
Cypher
Graph Patterns
A -[LOVES]-> B
ASCII art
MATCH (A)-[:LOVES]->(B)
WHERE A.name = "A"
RETURN B AS lover
Social Example
Social Graph - Create
Practical Cypher
CREATE
  (joe:Person {name:"Joe"}),
  (bob:Person {name:"Bob"}),
  (sally:Person {name:"Sally"}),
  (anna:Person {name:"Anna"}),
  (jim:Person {name:"Jim"}),
  (mike:Person {name:"Mike"}),
  (billy:Person {name:"Billy"}),

  (joe)-[:KNOWS]->(bob),
  (joe)-[:KNOWS]->(sally),
  (bob)-[:KNOWS]->(sally),
  (sally)-[:KNOWS]->(anna),
  (anna)-[:KNOWS]->(jim),
  (anna)-[:KNOWS]->(mike),
  (jim)-[:KNOWS]->(mike),
  (jim)-[:KNOWS]->(billy)
Social Graph - Friends of Joe's Friends
Practical Cypher
MATCH (person)-[:KNOWS]-(friend),
      (friend)-[:KNOWS]-(foaf)
WHERE person.name = "Joe"
  AND NOT (person)-[:KNOWS]-(foaf)
RETURN foaf
foaf
{name:"Anna"}
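The friend-of-friend query can be mirrored in plain Python to see why Anna is the only result. This sketch rebuilds the KNOWS relationships from the Create slide as an undirected adjacency map; the `neighbors` map and `friends_of_friends` function are illustrative names.

```python
from collections import defaultdict

# KNOWS relationships from the Create slide.
KNOWS = [
    ("Joe", "Bob"), ("Joe", "Sally"), ("Bob", "Sally"), ("Sally", "Anna"),
    ("Anna", "Jim"), ("Anna", "Mike"), ("Jim", "Mike"), ("Jim", "Billy"),
]

# The query ignores direction, so store each relationship both ways.
neighbors = defaultdict(set)
for a, b in KNOWS:
    neighbors[a].add(b)
    neighbors[b].add(a)


def friends_of_friends(person):
    """People two KNOWS hops away who are not already direct friends."""
    direct = neighbors[person]
    two_hops = set()
    for friend in direct:
        two_hops |= neighbors[friend]
    return sorted(two_hops - direct - {person})


print(friends_of_friends("Joe"))  # ['Anna']
```

Bob and Sally are reachable in two hops as well, but they are dropped because Joe already knows them, which is what the `NOT` clause in the query expresses.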
Social Graph - Common Friends
Practical Cypher
MATCH (person1)-[:KNOWS]-(friend),
      (person2)-[:KNOWS]-(friend)
WHERE person1.name = "Joe"
  AND person2.name = "Sally"
RETURN friend
friend
{name:"Bob"}
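In set terms, the common-friends query is an intersection of two neighbourhoods. A short sketch over the same KNOWS data from the Create slide (the `common_friends` name is illustrative):

```python
from collections import defaultdict

# KNOWS relationships from the Create slide, treated as undirected.
KNOWS = [
    ("Joe", "Bob"), ("Joe", "Sally"), ("Bob", "Sally"), ("Sally", "Anna"),
    ("Anna", "Jim"), ("Anna", "Mike"), ("Jim", "Mike"), ("Jim", "Billy"),
]

neighbors = defaultdict(set)
for a, b in KNOWS:
    neighbors[a].add(b)
    neighbors[b].add(a)


def common_friends(person1, person2):
    """People who are direct friends of both person1 and person2."""
    return sorted(neighbors[person1] & neighbors[person2])


print(common_friends("Joe", "Sally"))  # ['Bob']
```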
Social Graph - Shortest Path
Practical Cypher
MATCH path = shortestPath(
  (person1)-[:KNOWS*..6]-(person2)
)
WHERE person1.name = "Joe"
  AND person2.name = "Billy"
RETURN path
path
{start:"13759",
nodes:["13759","13757","13756","13755","13753"],
length:4,
relationships:["101407","101409","101410","101413"],
end:"13753"}
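A shortest path over unweighted relationships can be found with breadth-first search. The sketch below applies it to the KNOWS data from the Create slide, with a hop limit standing in for the `*..6` bound. It illustrates the technique and makes no claim about how `shortestPath` is implemented inside Neo4j; the function and parameter names are assumptions.

```python
from collections import defaultdict, deque

# KNOWS relationships from the Create slide, treated as undirected.
KNOWS = [
    ("Joe", "Bob"), ("Joe", "Sally"), ("Bob", "Sally"), ("Sally", "Anna"),
    ("Anna", "Jim"), ("Anna", "Mike"), ("Jim", "Mike"), ("Jim", "Billy"),
]

neighbors = defaultdict(set)
for a, b in KNOWS:
    neighbors[a].add(b)
    neighbors[b].add(a)


def shortest_path(start, goal, max_hops=6):
    """Breadth-first search; returns the node names on one shortest path, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        if len(path) - 1 == max_hops:
            continue  # do not extend paths beyond the hop limit
        for nxt in sorted(neighbors[path[-1]]):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None


path = shortest_path("Joe", "Billy")
print(path, len(path) - 1)
```

The path found has five nodes and four relationships, matching the `length:4` in the result above.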
Industry: Online Job Search
Use case: Social / Recommendations
Sausalito, CA
Background
• Online jobs and career community, providing anonymized inside information to job seekers
Business problem
• Wanted to leverage known fact that most jobs are found through personal & professional connections
• Needed to rely on an existing source of social network data. Facebook was the ideal choice.
• End users needed to get instant gratification
• Aiming to have the best job search service, in a very competitive market
Solution & Benefits
• First-to-market with a product that let users find jobs through their network of Facebook friends
• Job recommendations served real-time from Neo4j
• Individual Facebook graphs imported real-time into Neo4j
• Glassdoor now stores > 50% of the entire Facebook social graph
• Neo4j cluster has grown seamlessly, with new instances being brought online as graph size and load have increased
Graph model (from diagram): Person and Company nodes; KNOWS relationships between Persons, WORKS_AT relationships between Persons and Companies
Network Management
Example
Industry: Communications
Use case: Network Management
Paris, France
Background
• Second largest communications company in France
• Part of Vivendi Group, partnering with Vodafone
Business problem
• Infrastructure maintenance took one full week to plan, because of the need to model network impacts
• Needed rapid, automated “what if” analysis to ensure resilience during unplanned network outages
• Identify weaknesses in the network to uncover the need for additional redundancy
• Network information spread across > 30 systems, with daily changes to network infrastructure
• Business needs sometimes changed very rapidly
Solution & Benefits
• Flexible network inventory management system, to support modeling, aggregation & troubleshooting
• Single source of truth (Neo4j) representing the entire network
• Dynamic system loads data from 30+ systems, and allows new applications to access network data
• Modeling efforts greatly reduced because of the near 1:1 mapping between the real world and the graph
• Flexible schema highly adaptable to changing business requirements
Graph model (from diagram): Service, Router, Switch, Fiber Link and Oceanfloor Cable nodes, connected by DEPENDS_ON and LINKED relationships
Industry: Web/ISV, Communications
Use case: Network Management
Global (U.S., France)
Background
• World’s largest provider of IT infrastructure, software & services
• HP’s Unified Correlation Analyzer (UCA) application is a key application inside HP’s OSS Assurance portfolio
• Carrier-class resource & service management, problem determination, root cause & service impact analysis
• Helps communications operators manage large, complex and fast changing networks
Business problem
• Use network topology information to identify root causes of problems on the network
• Simplify alarm handling by human operators
• Automate handling of certain types of alarms
• Help operators respond rapidly to network issues
• Filter/group/eliminate redundant Network Management System alarms by event correlation
Solution & Benefits
• Accelerated product development time
• Extremely fast querying of network topology
• Graph representation a perfect domain fit
• 24x7 carrier-grade reliability with Neo4j HA clustering
• Met objective in under 6 months
Network Management - Create
Practical Cypher
CREATE
  (crm {name:"CRM"}),
  (dbvm {name:"Database VM"}),
  (www {name:"Public Website"}),
  (wwwvm {name:"Webserver VM"}),
  (srv1 {name:"Server 1"}),
  (san {name:"SAN"}),
  (srv2 {name:"Server 2"}),
  (crm)-[:DEPENDS_ON]->(dbvm),
  (dbvm)-[:DEPENDS_ON]->(srv2),
  (srv2)-[:DEPENDS_ON]->(san),
  (www)-[:DEPENDS_ON]->(dbvm),
  (www)-[:DEPENDS_ON]->(wwwvm),
  (wwwvm)-[:DEPENDS_ON]->(srv1),
  (srv1)-[:DEPENDS_ON]->(san)
Network Management - Impact Analysis
Practical Cypher
// Server 1 Outage
MATCH (n)<-[:DEPENDS_ON*]-(upstream)
WHERE n.name = "Server 1"
RETURN upstream
upstream
{name:"Webserver VM"}
{name:"Public Website"}
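The variable-length `DEPENDS_ON*` pattern amounts to a transitive closure: keep following relationships until nothing new turns up. A minimal Python sketch of the impact analysis over the DEPENDS_ON data from the Create slide (the `dependents` map and `upstream` function are illustrative names):

```python
from collections import defaultdict

# (component, thing it depends on) pairs from the Create slide.
DEPENDS_ON = [
    ("CRM", "Database VM"), ("Database VM", "Server 2"), ("Server 2", "SAN"),
    ("Public Website", "Database VM"), ("Public Website", "Webserver VM"),
    ("Webserver VM", "Server 1"), ("Server 1", "SAN"),
]

# Reverse index: for each component, who depends on it directly.
dependents = defaultdict(set)
for component, dependency in DEPENDS_ON:
    dependents[dependency].add(component)


def upstream(component):
    """Everything that depends on component, directly or transitively."""
    found, stack = set(), [component]
    while stack:
        for dependent in dependents[stack.pop()]:
            if dependent not in found:
                found.add(dependent)
                stack.append(dependent)
    return found


print(sorted(upstream("Server 1")))  # ['Public Website', 'Webserver VM']
```

An outage of Server 1 therefore affects the Webserver VM and, through it, the Public Website.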
Network Management - Dependency Analysis
Practical Cypher
// Public website dependencies
MATCH (n)-[:DEPENDS_ON*]->(downstream)
WHERE n.name = "Public Website"
RETURN downstream
downstream
{name:"Database VM"}
{name:"Server 2"}
{name:"SAN"}
{name:"Webserver VM"}
{name:"Server 1"}
Network Management - Statistics
Practical Cypher
// Most depended on component
MATCH (n)<-[:DEPENDS_ON*]-(dependent)
RETURN n,
  count(DISTINCT dependent)
  AS dependents
ORDER BY dependents DESC
LIMIT 1
n dependents
{name:"SAN"} 6
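The statistic above combines a transitive traversal with an aggregate: count the distinct dependents of every component and keep the largest. A rough Python equivalent over the DEPENDS_ON data from the Create slide, with illustrative helper names:

```python
from collections import defaultdict

# (component, thing it depends on) pairs from the Create slide.
DEPENDS_ON = [
    ("CRM", "Database VM"), ("Database VM", "Server 2"), ("Server 2", "SAN"),
    ("Public Website", "Database VM"), ("Public Website", "Webserver VM"),
    ("Webserver VM", "Server 1"), ("Server 1", "SAN"),
]

dependents = defaultdict(set)
for component, dependency in DEPENDS_ON:
    dependents[dependency].add(component)


def upstream(component):
    """Distinct components that depend on component, directly or transitively."""
    found, stack = set(), [component]
    while stack:
        for dependent in dependents[stack.pop()]:
            if dependent not in found:
                found.add(dependent)
                stack.append(dependent)
    return found


components = {name for edge in DEPENDS_ON for name in edge}
counts = {name: len(upstream(name)) for name in components}
most_depended_on = max(counts, key=counts.get)

print(most_depended_on, counts[most_depended_on])  # SAN 6
```

Using a set for the traversal result plays the role of `count(DISTINCT dependent)`: the Public Website reaches the SAN by two routes but is counted once.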
Logistics
~ Package Routing ~
Industry: Logistics
Use case: Parcel Routing
Background
• One of the world’s largest logistics carriers
• Projected to outgrow capacity of old system
• New parcel routing system
• Single source of truth for entire network
• B2C & B2B parcel tracking
• Real-time routing: up to 5M parcels per day
Business problem
• 24x7 availability, year round
• Peak loads of 2500+ parcels per second
• Complex and diverse software stack
• Need predictable performance & linear scalability
• Daily changes to logistics network: route from any point, to any point
Solution & Benefits
• Neo4j provides the ideal domain fit: a logistics network is a graph
• Extreme availability & performance with Neo4j clustering
• Hugely simplified queries, vs. relational for complex routing
• Flexible data model can reflect real-world data variance much better than relational
• “Whiteboard friendly” model easy to understand
Cisco.com
San Jose, CA
Industry: Communications
Use case: Recommendations
Background
• Cisco.com serves customer and business customers with Support Services
• Needed real-time recommendations, to encourage use of online knowledge base
• Cisco had been successfully using Neo4j for its internal master data management solution
• Identified a strong fit for online recommendations
Business problem
• Call center volumes needed to be lowered by improving the efficacy of online self service
• Leverage large amounts of knowledge stored in service cases, solutions, articles, forums, etc.
• Problem resolution times, as well as support costs, needed to be lowered
Solution & Benefits
• Cases, solutions, articles, etc. continuously scraped for cross-reference links, and represented in Neo4j
• Real-time reading recommendations via Neo4j
• Neo4j Enterprise with HA cluster
• The result: customers obtain help faster, with decreased reliance on customer support
Graph model (from diagram): Support Case, Solution, Knowledge Base Article and Message nodes
Consumer Web Giants Depend on Five Graphs
Gartner’s “5 Graphs”
Social Graph
Interest Graph
Payment Graph
Intent Graph
Mobile Graph
Ref: http://www.gartner.com/id=2081316
Questions ?
Innovate. Share. Connect.
San Francisco
October 3 - 4
www.graphconnect.com
(graphs)-[:ARE]->(everywhere)
www.neo4j.org
Recommended Reading & Next Steps
for Learning About Graphs...
www.graphdatabases.com
Get the free ebook!

More Related Content

PDF
Hadoop and Neo4j: A Winning Combination for Bioinformatics
PDF
The Graph Database Universe: Neo4j Overview
PDF
Building a Graph-based Analytics Platform
PDF
Democratizing Data at Airbnb
PDF
Graph Analysis over JSON, Larus
PPTX
The year of the graph: do you really need a graph database? How do you choose...
PPTX
Family tree of data – provenance and neo4j
PDF
An Introduction to Graph: Database, Analytics, and Cloud Services
Hadoop and Neo4j: A Winning Combination for Bioinformatics
The Graph Database Universe: Neo4j Overview
Building a Graph-based Analytics Platform
Democratizing Data at Airbnb
Graph Analysis over JSON, Larus
The year of the graph: do you really need a graph database? How do you choose...
Family tree of data – provenance and neo4j
An Introduction to Graph: Database, Analytics, and Cloud Services

What's hot (20)

PDF
Intro to Neo4j and Graph Databases
PDF
Graph All the Things: An Introduction to Graph Databases
PDF
Relational to Big Graph
PDF
Graph database Use Cases
PPTX
An Introduction to NOSQL, Graph Databases and Neo4j
PDF
Intro to Graphs and Neo4j
PDF
How Graph Databases efficiently store, manage and query connected data at s...
PDF
Neo4j MySql MS-SQL comparison
PPTX
Graph Data: a New Data Management Frontier
PDF
袁晓如:大数据时代可视化和可视分析的机遇与挑战
PDF
Challenges in the Design of a Graph Database Benchmark
PDF
Floods of Twitter Data - StampedeCon 2016
PDF
Bigdata and ai in p2 p industry: Knowledge graph and inference
PPT
Neo4J : Introduction to Graph Database
PPTX
Introduction: Relational to Graphs
PDF
NOSQLEU - Graph Databases and Neo4j
PDF
Intro to Neo4j Webinar
PDF
Building a data processing pipeline in Python
PPT
Graph db
PDF
Introducing Neo4j
Intro to Neo4j and Graph Databases
Graph All the Things: An Introduction to Graph Databases
Relational to Big Graph
Graph database Use Cases
An Introduction to NOSQL, Graph Databases and Neo4j
Intro to Graphs and Neo4j
How Graph Databases efficiently store, manage and query connected data at s...
Neo4j MySql MS-SQL comparison
Graph Data: a New Data Management Frontier
袁晓如:大数据时代可视化和可视分析的机遇与挑战
Challenges in the Design of a Graph Database Benchmark
Floods of Twitter Data - StampedeCon 2016
Bigdata and ai in p2 p industry: Knowledge graph and inference
Neo4J : Introduction to Graph Database
Introduction: Relational to Graphs
NOSQLEU - Graph Databases and Neo4j
Intro to Neo4j Webinar
Building a data processing pipeline in Python
Graph db
Introducing Neo4j
Ad

Viewers also liked (20)

PPTX
Big Data graph Clustering with Laurence O'Toole - Digital Marketing Show, Nov...
PPTX
GraphTalks Hamburg - Einführung in Graphdatenbanken
PDF
GraphDay Stockholm - Levaraging Graph-Technology to fight Financial Fraud
PDF
GraphDay Stockholm - Graphs in Action
PDF
GraphDay Stockholm - iKnow Solutions - The Value Add of Graphs to Analytics a...
PDF
GraphDay Stockholm - Telia Zone
PPTX
Neo4j GraphTalks - Einführung in Graphdatenbanken
PDF
GraphDay Stockholm - Graphs in the Real World: Top Use Cases for Graph Databases
PPTX
The Five Graphs of Government: How Federal Agencies can Utilize Graph Technology
PPTX
GraphTalks Rome - Selecting the right Technology
PDF
GraphTalks Rome - Introducing Neo4j
PDF
GraphTalks Rome - Identity and Access Management
PDF
GraphTalks Rome - The Italian Business Graph
PPTX
Knowledge Architecture: Graphing Your Knowledge
PDF
Working With a Real-World Dataset in Neo4j: Import and Modeling
PDF
Neo4j PartnerDay Amsterdam 2017
PDF
How to Design Retail Recommendation Engines with Neo4j
PPTX
GraphTalks Hamburg - Semantic Data Management
PDF
Neo4j Partner Tag Berlin - Investigating the Panama Papers connections with n...
PPTX
Neo4j Partner Tag Berlin - Potential für System-Integratoren und Berater
Big Data graph Clustering with Laurence O'Toole - Digital Marketing Show, Nov...
GraphTalks Hamburg - Einführung in Graphdatenbanken
GraphDay Stockholm - Levaraging Graph-Technology to fight Financial Fraud
GraphDay Stockholm - Graphs in Action
GraphDay Stockholm - iKnow Solutions - The Value Add of Graphs to Analytics a...
GraphDay Stockholm - Telia Zone
Neo4j GraphTalks - Einführung in Graphdatenbanken
GraphDay Stockholm - Graphs in the Real World: Top Use Cases for Graph Databases
The Five Graphs of Government: How Federal Agencies can Utilize Graph Technology
GraphTalks Rome - Selecting the right Technology
GraphTalks Rome - Introducing Neo4j
GraphTalks Rome - Identity and Access Management
GraphTalks Rome - The Italian Business Graph
Knowledge Architecture: Graphing Your Knowledge
Working With a Real-World Dataset in Neo4j: Import and Modeling
Neo4j PartnerDay Amsterdam 2017
How to Design Retail Recommendation Engines with Neo4j
GraphTalks Hamburg - Semantic Data Management
Neo4j Partner Tag Berlin - Investigating the Panama Papers connections with n...
Neo4j Partner Tag Berlin - Potential für System-Integratoren und Berater
Ad

Graphs & Big Data - Philip Rathle and Andreas Kollegger @ Big Data Science Meetup, Fremont, CA

  • 1. Graphs & Big Data: The Power of Graphs & The Technology Ecosystem Around Graphs. Philip Rathle, Sr. Director of Products, philip@neotechnology.com, @prathle. Andreas Kollegger, Product Experience Manager, andreas@neotechnology.com, @akollegger
  • 13. Early Adopters of Graph Tech
  • 14-16. Evolution of Web Search (Survival of the Fittest). Pre-1999: WWW Indexing, Discrete Data. 1999 - 2012: Google Invents PageRank, Connected Data (Simple). 2012-?: Google Knowledge Graph, Facebook Graph Search, Connected Data (Rich)
  • 17-18. Evolution of Online Recruiting (Survival of the Fittest). 1999: Keyword Search, Discrete Data. 2011-12: Social Discovery, Connected Data
  • 19-25. Emergent Graph in Other Industries (Actual Neo4j Graphs): Content Management & Access Control; Insurance Risk Analysis; Geo Routing (Public Transport); Network Cell Analysis; Network Asset Management; BioInformatics
  • 26-30. Emergent Graph in Other Industries (Actual Neo4j Graphs): Web Browsing; Portfolio Analytics; Gene Sequencing; Mobile Social Application
  • 32. Graph data model. Relationships: LIVES WITH, LOVES, OWNS, DRIVES, LOVES. Properties: name: "James", age: 32, twitter: "@spam"; name: "Mary", age: 35; property type: "car"; brand: "Volvo", model: "V70"
  • 33. Graph data model. Relationships: AUTHENTICATES, TRANSMITS_DATA, ASSIGNED_TO, CONNECTS_TO, AUTHORIZES. Properties: type: "Mobile", model: "iPhone 5", IMEI: 99 000107 765315 1; type: "BTS", Height_m: 9.8, Power: 400A, Backup_Generator: "Y"; device type: "Test"; type: "Trunk", mbps_capacity: 5000; type: "Central Office", CLLI Code: "PTLEORTEDS0"
  • 37-39. Relational tables. Member: (143, Andreas). Group: (326, Big Data Fremont), (725, Big Data San Francisco), (981, Big Data Boston). Member_Group: (143, 981), (143, 725), (143, 326)
  • 40. Andreas Big Data Fremont Big Data San Francisco Big Data Boston 143 326 725 981 143 981 143 725 143 326
  • 41. Andreas Big Data Fremont Big Data San Francisco Big Data Boston
  • 42-43. A Property Graph. Nodes: {uid: ABK, name: Andreas}, {uid: FRE, where: Fremont}, {uid: SFO, where: San Francisco}, {uid: BOS, where: Boston}. Relationships: member, member, member
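
To make the contrast between the join-table slides and the property-graph slides concrete, here is a minimal Python sketch of both representations. The ids, names and uids come from the slides; the direction of the `member` relationships (from Andreas to each group) and the helper names `groups_via_join` and `groups_via_graph` are assumptions made for illustration.

```python
# Relational version: two entity tables plus a join table (slides 37-39).
members = {143: "Andreas"}
groups = {326: "Big Data Fremont", 725: "Big Data San Francisco", 981: "Big Data Boston"}
member_group = [(143, 981), (143, 725), (143, 326)]  # (member_id, group_id) rows


def groups_via_join(member_id):
    """Scan the join table, then look each matching group up by key."""
    return [groups[g] for (m, g) in member_group if m == member_id]


# Property-graph version: nodes carry properties, relationships hang off the node.
nodes = {
    "ABK": {"name": "Andreas"},
    "FRE": {"where": "Fremont"},
    "SFO": {"where": "San Francisco"},
    "BOS": {"where": "Boston"},
}
# Assumed direction: Andreas -[member]-> group.
relationships = {"ABK": [("member", "FRE"), ("member", "SFO"), ("member", "BOS")]}


def groups_via_graph(uid):
    """Follow the node's own 'member' relationships; no join table involved."""
    return [
        nodes[target]["where"]
        for (rel_type, target) in relationships.get(uid, [])
        if rel_type == "member"
    ]


print(groups_via_join(143))
print(groups_via_graph("ABK"))
```

In the graph version the question "which groups is Andreas in?" is answered by walking outward from one node, which is the point the slides build toward.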
  • 44. What Can You Do With Graphs?
  • 47-48. Cypher query language example (http://maxdemarzi.com/?s=facebook): MATCH (me:Person)-[:IS_FRIEND_OF]->(friend), (friend)-[:LIKES]->(restaurant), (restaurant)-[:LOCATED_IN]->(city:Location), (restaurant)-[:SERVES]->(cuisine:Cuisine) WHERE me.name = 'Philip' AND city.location = 'New York' AND cuisine.cuisine = 'Sushi' RETURN restaurant.name
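
The pattern in this query can be read as three hops plus two filters. The following Python sketch walks the same hops over a tiny in-memory data set. Only the relationship types and the filter values ('Philip', 'New York', 'Sushi') come from the slide; every friend and restaurant name below is hypothetical sample data, and `recommend` is an illustrative helper, not part of any Neo4j API.

```python
# Hypothetical sample data keyed by relationship type.
is_friend_of = {"Philip": ["Ann", "Ben"]}
likes = {"Ann": ["Sushi Zen", "Pasta Place"], "Ben": ["Sushi Zen", "Tokyo Roll"]}
located_in = {"Sushi Zen": "New York", "Pasta Place": "New York", "Tokyo Roll": "Boston"}
serves = {"Sushi Zen": "Sushi", "Pasta Place": "Italian", "Tokyo Roll": "Sushi"}


def recommend(me, city, cuisine):
    """Restaurants liked by my friends that match the city and cuisine filters."""
    results = []
    for friend in is_friend_of.get(me, []):        # (me)-[:IS_FRIEND_OF]->(friend)
        for restaurant in likes.get(friend, []):   # (friend)-[:LIKES]->(restaurant)
            if located_in.get(restaurant) == city and serves.get(restaurant) == cuisine:
                results.append(restaurant)         # one row per matching path
    return results


print(recommend("Philip", "New York", "Sushi"))
```

As with the Cypher query, a restaurant liked by two friends appears once per matching path; deduplicating would need a set here (or `DISTINCT` in Cypher).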
  • 51-52. What drugs will bind to protein X and not interact with drug Y? Of course... a graph is a graph is a graph
  • 66-70. The Zone of SQL Adequacy. Chart axes: Connectedness of Data Set, Performance. Curves: SQL database; Requirement of application. Labeled applications: Salary List, ERP, CRM; Network / Data Center Management, Social, Master Data Management, Geo. Region label: Graph Database Optimal Comfort Zone
  • 72-73. #1: Graph Local Queries, e.g. Recommendations, Friend-of-Friend, Shortest Path
  • 74-75. #2: Graph Global Queries, e.g. How many restaurants, on average, has each person liked?
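
A global query touches every node of a given kind rather than starting from one anchor node. As a rough illustration, the sketch below computes the average number of liked restaurants per person over the whole data set. The people, restaurants and the `average_likes` helper are hypothetical, and counting people with zero likes in the average is an assumption about how the question should be read.

```python
# Hypothetical LIKES data: person -> restaurants they have liked.
likes = {
    "Ann": ["r1", "r2"],
    "Ben": ["r1"],
    "Cat": [],  # assumption: people with no likes still count toward the average
}


def average_likes(likes_by_person):
    """Visit every person (a global scan) and average their distinct like counts."""
    if not likes_by_person:
        return 0.0
    counts = [len(set(restaurants)) for restaurants in likes_by_person.values()]
    return sum(counts) / len(counts)


print(average_likes(likes))
```

Unlike the local queries on the previous slide, the cost here grows with the size of the whole graph, which is why the deck assigns this class of work to graph compute engines.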
  • 76-79. Key Graph Analytic Technologies. Data Storage & Processing: Graph Databases, Graph Compute Engines. Programming: Graph-Centric APIs & Languages, Graph Algorithms. Tools: Visualization Tools & Libraries, Other
  • 80-84. What is a Graph Database? "A graph database... is an online database management system with CRUD methods that expose a graph data model" [1]. Two important properties: • Native graph storage engine: written from the ground up to manage graph data • Native graph processing, including index-free adjacency to facilitate traversals. [1] Robinson, Webber, Eifrem. Graph Databases. O'Reilly, 2013. p. 5. ISBN-10: 1449356265
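
Index-free adjacency means each node holds direct references to its neighbours, so a traversal follows pointers instead of consulting a global index at every hop. The Python sketch below is only a conceptual illustration of that idea; the `Node` class, `connect` and `traverse` are hypothetical names, and this is not a description of how Neo4j lays out its store files.

```python
class Node:
    """A node that keeps direct references to its adjacent nodes."""

    def __init__(self, name):
        self.name = name
        self.neighbors = []  # references to Node objects, not ids to look up


def connect(a, b):
    """Create an undirected link by storing each node on the other."""
    a.neighbors.append(b)
    b.neighbors.append(a)


def traverse(start, depth):
    """Names of nodes reachable within `depth` hops, following references only."""
    seen = {start}
    frontier = [start]
    for _ in range(depth):
        next_frontier = []
        for node in frontier:
            for neighbor in node.neighbors:  # cost depends on local degree only
                if neighbor not in seen:
                    seen.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return sorted(n.name for n in seen if n is not start)


a, b, c = Node("a"), Node("b"), Node("c")
connect(a, b)
connect(b, c)
print(traverse(a, 1))
print(traverse(a, 2))
```

Each hop costs work proportional to the number of neighbours of the current node, independent of the total number of nodes, which is the property the slide refers to.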
  • 85. Neo Technology, Inc Confidential Graph Databases are Designed to: 1. Store inter-connected data 2. Make it easy to make sense of that data 3. Enable extreme-performance operations for: • Discovery of connected data patterns • Relatedness queries > depth 1 • Relatedness queries of arbitrary length 4. Make it easy to evolve the database
  • 86. Top Reasons People Use Graph Databases: 1. Problems with join performance. 2. Continuously evolving data set (often involves wide and sparse tables). 3. The shape of the domain is naturally a graph. 4. Open-ended business requirements necessitating fast, iterative development.
  • 87-88. Graph Compute Engine: processing engine that enables graph global computational algorithms to be run against large data sets. Diagram labels: System(s) of Record; Data extraction, transformation, and load; Graph Compute Engine; In-Memory Processing; Graph Mining Engine (Working Storage)
  • 91-95. Graph Database Deployment. Diagram labels: Application; Other Databases; ETL; Graph Database Cluster (Data Storage & Business Rules Execution); Reporting (Graph Dashboards & Ad-hoc Analysis); Graph Visualization; End User (Ad-hoc visual navigation & discovery); Data Scientist (Ad-Hoc Analysis); Bulk Analytic Infrastructure (e.g. Graph Compute Engine); ETL; Graph Mining & Aggregation
  • 96. Graph Dashboards
  • 97. Fraud Detection & Money Laundering
  • 100. Graphs in the Real World: Case Study Examples & Working with Graphs. Philip Rathle, Sr. Director of Products, philip@neotechnology.com, @prathle. Andreas Kollegger, Product Experience Manager, andreas@neotechnology.com, @akollegger
  • 101. Cypher
  • 103-105. Cypher: Graph Patterns / ASCII art. Diagram: A LOVES B. MATCH (A)-[:LOVES]->(B) WHERE A.name = "A" RETURN B AS lover
  • 107. Social Graph - Create (Practical Cypher): CREATE (joe:Person {name:"Joe"}), (bob:Person {name:"Bob"}), (sally:Person {name:"Sally"}), (anna:Person {name:"Anna"}), (jim:Person {name:"Jim"}), (mike:Person {name:"Mike"}), (billy:Person {name:"Billy"}), (joe)-[:KNOWS]->(bob), (joe)-[:KNOWS]->(sally), (bob)-[:KNOWS]->(sally), (sally)-[:KNOWS]->(anna), (anna)-[:KNOWS]->(jim), (anna)-[:KNOWS]->(mike), (jim)-[:KNOWS]->(mike), (jim)-[:KNOWS]->(billy)
  • 108. Social Graph - Friends of Joe's Friends (Practical Cypher): MATCH (person)-[:KNOWS]-(friend), (friend)-[:KNOWS]-(foaf) WHERE person.name = "Joe" AND NOT (person)-[:KNOWS]-(foaf) RETURN foaf. Result: foaf {name:"Anna"}
  • 109. Social Graph - Common Friends (Practical Cypher): MATCH (person1)-[:KNOWS]-(friend), (person2)-[:KNOWS]-(friend) WHERE person1.name = "Joe" AND person2.name = "Sally" RETURN friend. Result: friend {name:"Bob"}
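
The friends-of-friends and common-friends queries can be cross-checked outside a database. The sketch below rebuilds the KNOWS relationships from the Create slide as an undirected adjacency map in plain Python and evaluates both questions with set operations. The helper names `friends_of_friends` and `common_friends` are hypothetical, and the set logic is a stand-in for Cypher's pattern matching, not a description of how Neo4j executes it.

```python
from collections import defaultdict

# KNOWS pairs copied from the "Social Graph - Create" slide.
KNOWS = [
    ("Joe", "Bob"), ("Joe", "Sally"), ("Bob", "Sally"), ("Sally", "Anna"),
    ("Anna", "Jim"), ("Anna", "Mike"), ("Jim", "Mike"), ("Jim", "Billy"),
]

# The queries match KNOWS in either direction, so store both directions.
neighbors = defaultdict(set)
for a, b in KNOWS:
    neighbors[a].add(b)
    neighbors[b].add(a)


def friends_of_friends(person):
    """People two hops away who are not already direct friends."""
    direct = set(neighbors[person])
    two_hops = set()
    for friend in direct:
        two_hops |= neighbors[friend]
    return two_hops - direct - {person}


def common_friends(person1, person2):
    """People who know both person1 and person2."""
    return neighbors[person1] & neighbors[person2]


print(friends_of_friends("Joe"))
print(common_friends("Joe", "Sally"))
```

Both results agree with the slides: Anna is Joe's only friend-of-a-friend, and Bob is the only friend Joe and Sally share.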
  • 110. Social Graph - Shortest Path (Practical Cypher): MATCH path = shortestPath( (person1)-[:KNOWS*..6]-(person2) ) WHERE person1.name = "Joe" AND person2.name = "Billy" RETURN path. Result: path {start:"13759", nodes:["13759","13757","13756","13755","13753"], length:4, relationships:["101407","101409","101410","101413"], end:"13753"}
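
For an unweighted graph, a shortest path can be found with breadth-first search. The sketch below applies a plain BFS to the same KNOWS data, with a hop limit mirroring the `*..6` bound in the query. It is an illustrative stand-in, assuming nothing about how Neo4j implements `shortestPath`; the function name `shortest_path` and its `max_hops` parameter are hypothetical.

```python
from collections import deque

# KNOWS pairs copied from the "Social Graph - Create" slide.
KNOWS = [
    ("Joe", "Bob"), ("Joe", "Sally"), ("Bob", "Sally"), ("Sally", "Anna"),
    ("Anna", "Jim"), ("Anna", "Mike"), ("Jim", "Mike"), ("Jim", "Billy"),
]

neighbors = {}
for a, b in KNOWS:
    neighbors.setdefault(a, set()).add(b)
    neighbors.setdefault(b, set()).add(a)


def shortest_path(start, goal, max_hops=6):
    """Breadth-first search; returns a list of names or None if out of range."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        current = path[-1]
        if current == goal:
            return path
        if len(path) - 1 >= max_hops:
            continue  # do not extend paths beyond the hop limit
        for nxt in sorted(neighbors.get(current, ())):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None


path = shortest_path("Joe", "Billy")
print(path, "length:", len(path) - 1)
```

The path found has four relationships, matching the `length:4` in the slide's result.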
  • 111-113. Industry: Online Job Search. Use case: Social / Recommendations. Sausalito, CA. Background: • Online jobs and career community, providing anonymized inside information to job seekers. Business problem: • Wanted to leverage the known fact that most jobs are found through personal & professional connections • Needed to rely on an existing source of social network data; Facebook was the ideal choice • End users needed to get instant gratification • Aiming to have the best job search service in a very competitive market. Solution & Benefits: • First-to-market with a product that let users find jobs through their network of Facebook friends • Job recommendations served real-time from Neo4j • Individual Facebook graphs imported real-time into Neo4j • Glassdoor now stores > 50% of the entire Facebook social graph • Neo4j cluster has grown seamlessly, with new instances being brought online as graph size and load have increased. Diagram labels: Person, Company; KNOWS, WORKS_AT
  • 115-117. Industry: Communications. Use case: Network Management. Paris, France. Background: • Second largest communications company in France • Part of Vivendi Group, partnering with Vodafone. Business problem: • Infrastructure maintenance took one full week to plan, because of the need to model network impacts • Needed rapid, automated "what if" analysis to ensure resilience during unplanned network outages • Identify weaknesses in the network to uncover the need for additional redundancy • Network information spread across > 30 systems, with daily changes to network infrastructure • Business needs sometimes changed very rapidly. Solution & Benefits: • Flexible network inventory management system, to support modeling, aggregation & troubleshooting • Single source of truth (Neo4j) representing the entire network • Dynamic system loads data from 30+ systems, and allows new applications to access network data • Modeling efforts greatly reduced because of the near 1:1 mapping between the real world and the graph • Flexible schema highly adaptable to changing business requirements. Diagram labels: Service, Router, Switch, Fiber Link, Oceanfloor Cable; DEPENDS_ON, LINKED
  • 118-120. Industry: Web/ISV, Communications. Use case: Network Management. Global (U.S., France). Background: • World's largest provider of IT infrastructure, software & services • HP's Unified Correlation Analyzer (UCA) application is a key application inside HP's OSS Assurance portfolio • Carrier-class resource & service management, problem determination, root cause & service impact analysis • Helps communications operators manage large, complex and fast changing networks. Business problem: • Use network topology information to identify root causes of problems on the network • Simplify alarm handling by human operators • Automate handling of certain types of alarms • Help operators respond rapidly to network issues • Filter/group/eliminate redundant Network Management System alarms by event correlation. Solution & Benefits: • Accelerated product development time • Extremely fast querying of network topology • Graph representation a perfect domain fit • 24x7 carrier-grade reliability with Neo4j HA clustering • Met objective in under 6 months
  • 121. Network Management - Create (Practical Cypher): CREATE (crm {name:"CRM"}), (dbvm {name:"Database VM"}), (www {name:"Public Website"}), (wwwvm {name:"Webserver VM"}), (srv1 {name:"Server 1"}), (san {name:"SAN"}), (srv2 {name:"Server 2"}), (crm)-[:DEPENDS_ON]->(dbvm), (dbvm)-[:DEPENDS_ON]->(srv2), (srv2)-[:DEPENDS_ON]->(san), (www)-[:DEPENDS_ON]->(dbvm), (www)-[:DEPENDS_ON]->(wwwvm), (wwwvm)-[:DEPENDS_ON]->(srv1), (srv1)-[:DEPENDS_ON]->(san)
  • 122. Network Management - Impact Analysis (Practical Cypher). Server 1 outage: MATCH (n)<-[:DEPENDS_ON*]-(upstream) WHERE n.name = "Server 1" RETURN upstream. Result: upstream {name:"Webserver VM"}, {name:"Public Website"}
  • 123. Network Management - Dependency Analysis (Practical Cypher). Public website dependencies: MATCH (n)-[:DEPENDS_ON*]->(downstream) WHERE n.name = "Public Website" RETURN downstream. Result: downstream {name:"Database VM"}, {name:"Server 2"}, {name:"SAN"}, {name:"Webserver VM"}, {name:"Server 1"}
  • 124. Network Management - Statistics (Practical Cypher). Most depended on component: MATCH (n)<-[:DEPENDS_ON*]-(dependent) RETURN n, count(DISTINCT dependent) AS dependents ORDER BY dependents DESC LIMIT 1. Result: n {name:"SAN"}, dependents 6
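
The three network-management queries are all variable-length traversals over DEPENDS_ON, in one direction or the other. The Python sketch below reproduces them on the data from the Create slide using a simple depth-first reachability helper. The names `reachable`, `impact_of`, `dependencies_of` and `most_depended_on` are hypothetical, and the set-based traversal is an illustration of the idea rather than Neo4j's execution strategy.

```python
# DEPENDS_ON pairs copied from the "Network Management - Create" slide.
DEPENDS_ON = [
    ("CRM", "Database VM"), ("Database VM", "Server 2"), ("Server 2", "SAN"),
    ("Public Website", "Database VM"), ("Public Website", "Webserver VM"),
    ("Webserver VM", "Server 1"), ("Server 1", "SAN"),
]

downstream = {}  # component -> things it depends on
upstream = {}    # component -> things that depend on it
for src, dst in DEPENDS_ON:
    downstream.setdefault(src, set()).add(dst)
    upstream.setdefault(dst, set()).add(src)


def reachable(start, edges):
    """All nodes reachable from start over one or more edges (like [:DEPENDS_ON*])."""
    found, stack = set(), [start]
    while stack:
        for nxt in edges.get(stack.pop(), ()):
            if nxt not in found:
                found.add(nxt)
                stack.append(nxt)
    return found


def impact_of(component):
    """Everything that directly or indirectly depends on the component."""
    return reachable(component, upstream)


def dependencies_of(component):
    """Everything the component directly or indirectly depends on."""
    return reachable(component, downstream)


def most_depended_on():
    """Component with the largest number of distinct dependents."""
    names = set(downstream) | set(upstream)
    best = max(names, key=lambda name: len(impact_of(name)))
    return best, len(impact_of(best))


print(impact_of("Server 1"))
print(dependencies_of("Public Website"))
print(most_depended_on())
```

Because `reachable` collects a set, SAN appears once in the dependency result even though the Public Website reaches it along two paths; the Cypher query as written returns one row per path unless `DISTINCT` is added.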
  • 126-128. Industry: Logistics. Use case: Parcel Routing. Background: • One of the world's largest logistics carriers • Projected to outgrow capacity of old system • New parcel routing system • Single source of truth for entire network • B2C & B2B parcel tracking • Real-time routing: up to 5M parcels per day. Business problem: • 24x7 availability, year round • Peak loads of 2500+ parcels per second • Complex and diverse software stack • Need predictable performance & linear scalability • Daily changes to logistics network: route from any point, to any point. Solution & Benefits: • Neo4j provides the ideal domain fit: a logistics network is a graph • Extreme availability & performance with Neo4j clustering • Hugely simplified queries, vs. relational, for complex routing • Flexible data model can reflect real-world data variance much better than relational • "Whiteboard friendly" model easy to understand
  • 129-131. Industry: Communications. Use case: Recommendations. Cisco.com, San Jose, CA. Background: • Cisco.com serves customer and business customers with Support Services • Needed real-time recommendations, to encourage use of online knowledge base • Cisco had been successfully using Neo4j for its internal master data management solution • Identified a strong fit for online recommendations. Business problem: • Call center volumes needed to be lowered by improving the efficacy of online self service • Leverage large amounts of knowledge stored in service cases, solutions, articles, forums, etc. • Problem resolution times, as well as support costs, needed to be lowered. Solution & Benefits: • Cases, solutions, articles, etc. continuously scraped for cross-reference links, and represented in Neo4j • Real-time reading recommendations via Neo4j • Neo4j Enterprise with HA cluster • The result: customers obtain help faster, with decreased reliance on customer support. Diagram labels: Support Case, Knowledge Base Article, Solution, Message
  • 132. Consumer Web Giants Depend on Five Graphs. Gartner's "5 Graphs": Social Graph, Interest Graph, Payment Graph, Intent Graph, Mobile Graph. Ref: http://www.gartner.com/id=2081316
  • 134. Recommended Reading & Next Steps for Learning About Graphs... Get the free ebook! www.graphdatabases.com. www.neo4j.org. GraphConnect (Innovate. Share. Connect.): San Francisco, October 3 - 4, www.graphconnect.com. (graphs)-[:ARE]->(everywhere)