GraphDB
Whirlwind Tour
Michael Hunger
Code Days - OOP
(Michael Hunger)-[:WORKS_FOR]->(Neo4j)
michael@neo4j.com | @mesirii | github.com/jexp | jexp.de/blog
Michael Hunger - Head of Developer Relations @Neo4j
Agenda: Why Graphs? | Use Cases | Data Model | Querying | Neo4j
Why Graphs?
Because the World is a Graph!
Everything and Everyone is Connected
• people, places, events
• companies, markets
• countries, history, politics
• sciences, art, teaching
• technology, networks, machines, applications, users
• software, code, dependencies, architecture, deployments
• criminals, fraudsters and their behavior
Value from Data Relationships
Common Use Cases
Internal Applications
• Master Data Management
• Network and IT Operations
• Fraud Detection
Customer-Facing Applications
• Real-Time Recommendations
• Graph-Based Search
• Identity and Access Management
The Rise of Connections in Data
• Networks of People: e.g., Employees, Customers, Suppliers, Partners, Influencers
• Business Processes: e.g., Risk management, Supply chain, Payments
• Knowledge Networks: e.g., Enterprise content, Domain specific content, eCommerce content
Data connections are increasing as rapidly as data volumes
Harnessing Connections Drives Business Value
Digital Transformation Megatrends, with Connected Data at the Center
• Enhanced Decision Making
• Hyper Personalization
• Massive Data Integration
• Data Driven Discovery & Innovation
• AI & Machine Learning
Applications
• Product Recommendations
• Personalized Health Care
• Media and Advertising
• Fraud Prevention
• Network Analysis
• Law Enforcement
• Drug Discovery
• Intelligence and Crime Detection
• Product & Process Innovation
• 360 view of customer
• Compliance
• Optimize Operations
• Price optimization
• Resource allocation
Graph Databases Are Hot
Lots of Choice
Newcomers in the last 3 years
• DSE Graph
• Agens Graph
• IBM Graph
• JanusGraph
• Tibco GraphDB
• Microsoft CosmosDB
• TigerGraph
• MemGraph
• AWS Neptune
• SAP HANA Graph
Database Technology Architectures
• Discrete Data: Relational DBMS, Other NoSQL
• Connected Data: Graph DB
Right Tool for the Job
The impact of Graphs
How Graphs are changing the World
GRAPHS FOR GOOD
A whirlwind tour of graph databases
Neo4j ICIJ Distribution
Better Health with Graphs
Cancer Research - Candiolo Cancer Institute
“Our application relies on complex
hierarchical data, which required a more
flexible model than the one provided by
the traditional relational database
model,” said Andrea Bertotti, MD
neo4j.com/case-studies/candiolo-cancer-institute-ircc/
Graph Databases in Healthcare and Life Sciences
14 Presenters from all around Europe on:
• Genome
• Proteome
• Human Pathway
• Reactome
• SNP
• Drug Discovery
• Metabolic Symbols
• ...
neo4j.com/blog/neo4j-life-sciences-healthcare-workshop-berlin/
DISRUPTION WITH GRAPHS
BETTER BUSINESS WITH GRAPHS
Common Graph Technology Use Cases
• Real-Time Recommendations
• Fraud Detection
• Network & IT Operations
• Master Data Management
• Knowledge Graph
• Identity & Access Management
AirBnb
Handling Large Graph Work Loads for Enterprises
Real-time promotion recommendations
• Record “Cyber Monday” sales
• About 35M daily transactions
• Each transaction is 3-22 hops
• Queries executed in 4ms or less
• Replaced IBM Websphere commerce
Marriott’s Real-time Pricing Engine
• 300M pricing operations per day
• 10x transaction throughput on half the hardware compared to Oracle
• Replaced Oracle database
Handling Package Routing in Real-Time
• Large postal service with over 500k employees
• Neo4j routes 7M+ packages daily at peak, with peaks of 5,000+ routing operations per second.
• Software
• Financial Services
• Telecom
• Retail & Consumer Goods
• Media & Entertainment
• Other Industries
Airbus
NEW INSIGHTS WITH GRAPHS
Machine Learning is Based on Graphs
The Property Graph
Model, Import, Query
The Whiteboard Model Is the Physical Model
• Eliminates Graph-to-Relational Mapping
• In your data model: bridge the gap between business and IT models
• In your application: greatly reduce need for application code
Example graph (slide diagram)
• Node: name: “Dan”, born: May 29, 1970, twitter: “@dan”
• Node: name: “Ann”, born: Dec 5, 1975
• CAR node: brand: “Volvo”, model: “V70”
• Relationship property: since: Jan 10, 2011
Property Graph Model Components
Nodes
• The objects in the graph
• Can have name-value properties
• Can be labeled
Relationships
• Relate nodes by type and direction
• Can have name-value properties
(Diagram: two PERSON nodes connected by LOVES relationships in both directions and by a LIVES WITH relationship.)
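The components listed above can be sketched as plain data structures. This is a minimal illustration, not how Neo4j stores data: the class names, the `outgoing` helper and the `DRIVES` relationship type are assumptions made for the example (the slide text only preserves the `since` property), while the property values come from the example slide.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A graph object with optional labels and name-value properties."""
    labels: set = field(default_factory=set)
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """A typed, directed connection from `start` to `end` with its own properties."""
    start: Node
    type: str
    end: Node
    properties: dict = field(default_factory=dict)


# Property values taken from the example slide.
dan = Node({"Person"}, {"name": "Dan", "born": "1970-05-29", "twitter": "@dan"})
ann = Node({"Person"}, {"name": "Ann", "born": "1975-12-05"})
car = Node({"Car"}, {"brand": "Volvo", "model": "V70"})

relationships = [
    Relationship(dan, "LOVES", ann),
    Relationship(ann, "LOVES", dan),
    Relationship(dan, "LIVES_WITH", ann),
    # DRIVES is an assumed relationship type for illustration only.
    Relationship(dan, "DRIVES", car, {"since": "2011-01-10"}),
]


def outgoing(node, rel_type):
    """Return the end nodes of `node`'s outgoing relationships of the given type."""
    return [r.end for r in relationships if r.start is node and r.type == rel_type]


print([n.properties["name"] for n in outgoing(dan, "LOVES")])
```

Direction and type live on the relationship itself, so the same pair of nodes can be connected several times with different meanings.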
Cypher: Powerful and Expressive Query Language
MATCH (:Person { name:"Dan"} ) -[:LOVES]-> (:Person { name:"Ann"} )
(Diagram: the pattern annotated part by part. Each NODE carries a LABEL, Person, and a PROPERTY, name; LOVES is the relationship from Dan to Ann.)
Relational Versus Graph Models
• Relational Model: Person, Person-Friend and Friend tables holding ANDREAS, DELIA, TOBIAS and MICA
• Graph Model: ANDREAS, TOBIAS, MICA and DELIA connected directly by KNOWS relationships
Retail ...
Recommendations
Our starting point – Northwind ER
Building Relationships in Graphs
(Customer)-[:ORDERED]->(Order)
Locate Foreign Keys
(FKs)-[:BECOME]->(Relationships) & Correct Directions
Drop Foreign Keys
Find the Join Tables
Simple Join Tables Become Relationships
Attributed Join Tables Become Relationships with Properties
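The conversion steps above can be sketched on a few in-memory rows. This is a minimal illustration under stated assumptions: the rows are a trimmed, illustrative stand-in for Northwind tables, and `add_nodes`, `nodes` and `relationships` are hypothetical names, not part of any import tool.

```python
# Illustrative rows only, not the real Northwind dataset.
customers = [{"CustomerID": "ALFKI", "CompanyName": "Alfreds Futterkiste"}]
orders = [{"OrderID": 10643, "CustomerID": "ALFKI"}]
products = [{"ProductID": 48, "ProductName": "Chocolat"}]
order_details = [{"OrderID": 10643, "ProductID": 48, "Quantity": 5}]  # attributed join table

nodes = {}          # (label, key) -> properties
relationships = []  # (start, type, end, properties)


def add_nodes(label, rows, key, drop=()):
    """Each row becomes a node; foreign-key columns listed in `drop` are left out."""
    for row in rows:
        nodes[(label, row[key])] = {k: v for k, v in row.items() if k not in drop}


add_nodes("Customer", customers, "CustomerID")
add_nodes("Order", orders, "OrderID", drop=("CustomerID",))  # drop the foreign key
add_nodes("Product", products, "ProductID")

# The foreign key orders.CustomerID becomes (Customer)-[:ORDERED]->(Order).
for row in orders:
    relationships.append(
        (("Customer", row["CustomerID"]), "ORDERED", ("Order", row["OrderID"]), {}))

# The join table becomes (Order)-[:INCLUDES]->(Product); its extra columns
# turn into relationship properties.
for row in order_details:
    props = {k: v for k, v in row.items() if k not in ("OrderID", "ProductID")}
    relationships.append(
        (("Order", row["OrderID"]), "INCLUDES", ("Product", row["ProductID"]), props))

print(relationships)
```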
(One) Northwind Graph Model
(:You)-[:QUERY]->(:Data)
in a graph
Who bought Chocolat?
You all know SQL
SELECT distinct c.CompanyName
FROM customers AS c
JOIN orders AS o
ON (c.CustomerID = o.CustomerID)
JOIN order_details AS od
ON (o.OrderID = od.OrderID)
JOIN products AS p
ON (od.ProductID = p.ProductID)
WHERE p.ProductName = 'Chocolat'
Apache Tinkerpop 3.3.x - Gremlin
g = graph.traversal();
g.V().hasLabel('Product')
.has('productName','Chocolat')
.in('INCLUDES')
.in('ORDERED')
.values('companyName').dedup();
W3C Sparql
PREFIX sales_db: <http://sales.northwind.com/>
SELECT DISTINCT ?company_name WHERE {
  ?c sales_db:CompanyName ?company_name .
  ?c sales_db:ORDERED ?o .
  ?o sales_db:ITEMS ?od .
  ?od sales_db:INCLUDES ?p .
  ?p sales_db:ProductName "Chocolat" .
}
openCypher
MATCH (c:Customer)-[:ORDERED]->(o)
-[:INCLUDES]->(p:Product)
WHERE p.productName = 'Chocolat'
RETURN distinct c.companyName
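All four queries walk the same path: customer, order, product. As a rough sketch of that traversal outside any database, assuming a tiny hypothetical graph held in dictionaries (the ids, names and the `who_bought` helper are invented for illustration):

```python
# Tiny hypothetical graph stored as adjacency lists, one dict per relationship type.
company_name = {"c1": "Alfreds Futterkiste", "c2": "Bon app'"}
product_name = {"p1": "Chocolat", "p2": "Chai"}
ordered = {"c1": ["o1"], "c2": ["o2", "o3"]}                  # (Customer)-[:ORDERED]->(Order)
includes = {"o1": ["p1"], "o2": ["p2"], "o3": ["p1", "p2"]}   # (Order)-[:INCLUDES]->(Product)


def who_bought(name):
    """Follow ORDERED, then INCLUDES, and keep customers that reach product `name`."""
    buyers = set()
    for customer, order_ids in ordered.items():
        for order_id in order_ids:
            if any(product_name[p] == name for p in includes[order_id]):
                buyers.add(company_name[customer])  # the set plays the role of DISTINCT
    return sorted(buyers)


print(who_bought("Chocolat"))
```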
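Basic Pattern: Customer's Orders?
MATCH (:Customer {customerName:"Delicatessen"} ) -[:ORDERED]-> (order:Order) RETURN order
(Diagram: the pattern annotated part by part. The first NODE has the LABEL Customer and a PROPERTY; ORDERED is the REL; the second NODE has the VAR order and the LABEL Order.)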
Basic Query: Customer's Orders?
MATCH (c:Customer)-[:ORDERED]->(order)
WHERE c.customerName = 'Delicatessen'
RETURN *
Basic Query: Customer's Frequent Purchases?
MATCH (c:Customer)-[:ORDERED]->
()-[:INCLUDES]->(p:Product)
WHERE c.customerName = 'Delicatessen'
RETURN p.productName, count(*) AS freq
ORDER BY freq DESC LIMIT 10;
openCypher - Recommendation
MATCH
(c:Customer)-[:ORDERED]->(o1)-[:INCLUDES]->(p),
(peer)-[:ORDERED]->(o2)-[:INCLUDES]->(p),
(peer)-[:ORDERED]->(o3)-[:INCLUDES]->(reco)
WHERE c.customerId = $customerId
AND NOT (c)-[:ORDERED]->()-[:INCLUDES]->(reco)
RETURN reco.productName, count(*) AS freq
ORDER BY freq DESC LIMIT 10
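The idea behind the recommendation query is: find peers who bought something I bought, then rank what they bought that I have not. A minimal sketch of that logic, assuming hypothetical purchase data and a hypothetical `recommend` helper. It simplifies the Cypher semantics: it counts each peer once per product, whereas the query counts every matching path.

```python
from collections import Counter

# Hypothetical data: customer -> set of products across all of their orders.
purchases = {
    "alice": {"Chocolat", "Chai"},
    "bob": {"Chocolat", "Tofu", "Ikura"},
    "carol": {"Chai", "Tofu"},
    "dave": {"Konbu"},
}


def recommend(customer, limit=10):
    """Rank products bought by peers (customers sharing a product) that `customer` lacks."""
    own = purchases[customer]
    freq = Counter()
    for peer, theirs in purchases.items():
        if peer == customer or not (own & theirs):
            continue  # not a peer: no product in common
        for product in theirs - own:
            freq[product] += 1
    return freq.most_common(limit)


print(recommend("alice"))
```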
Product Cross-Sell
MATCH
(:Product {productName: 'Chocolat'})<-[:INCLUDES]-(:Order)
<-[:SOLD]-(employee)-[:SOLD]->(o2)-[:INCLUDES]->(cross:Product)
RETURN
employee.firstName, cross.productName,
count(distinct o2) AS freq
ORDER BY freq DESC LIMIT 5;
openCypher
openCypher...
...is a community effort to evolve Cypher, and to
make it the most useful language for querying
property graphs
openCypher implementations
SAP Hana Graph, Redis, Agens Graph, Cypher.PL, Neo4j
github.com/opencypher
Language Artifacts
● Cypher 9 specification
● ANTLR and EBNF Grammars
● Formal Semantics (SIGMOD)
● TCK (Cucumber test suite)
● Style Guide
Implementations & Code
● openCypher for Apache Spark
● openCypher for Gremlin
● open source frontend (parser)
● ...
Cypher 10
● Next version of Cypher
● Actively working on natural language specification
● New features
○ Subqueries
○ Multiple graphs
○ Path patterns
○ Configurable pattern matching semantics
Extending Neo4j - User Defined Procedures & Functions
(Diagram: applications connect over Bolt to the Neo4j execution engine, which runs user defined procedures and user defined functions.)
User Defined Procedures & Functions let
you write custom code that is:
• Written in any JVM language
• Deployed to the Database
• Accessed by applications via Cypher
Procedure Examples
Built-In
• Metadata Information
• Index Management
• Security
• Cluster Information
• Query Listing & Cancellation
• ...
Libraries
• APOC (std library)
• Spatial
• RDF (neosemantics)
• NLP
• ...
neo4j.com/developer/procedures-functions
Example: Data(base) Integration
Graph Analytics
Neo4j Graph Algorithms
“Graph analysis is possibly the single most effective competitive differentiator for organizations pursuing data-driven operations and decisions”
The Impact of Connected Data
Existing Options (so far)
• Data Processing
  ○ Spark with GraphX, Flink with Gelly
  ○ Gremlin Graph Computer
• Dedicated Graph Processing
  ○ Urika, GraphLab, Giraph, Mosaic, GPS, Signal-Collect, Gradoop
• Data Scientist Toolkit
  ○ igraph, NetworkX, Boost in Python, R, C
Goal: Iterate Quickly
• Combine data from sources into one graph
• Project to relevant subgraphs
• Enrich data with algorithms
• Traverse, collect, filter, aggregate with queries
• Visualize, Explore, Decide, Export
• From all APIs and Tools
Usage
1. Call as Cypher procedure
2. Pass in specification (Label, Prop, Query) and configuration
3. The ~.stream variant returns (a lot of) results
CALL algo.<name>.stream('Label','TYPE',{conf})
YIELD nodeId, score
4. The non-stream variant writes results to the graph and returns statistics
CALL algo.<name>('Label','TYPE',{conf})
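The difference between the two call styles can be illustrated with a toy algorithm. The sketch below is not the Neo4j graph algorithms library: `page_rank`, `stream` and `write` are hypothetical Python functions, and the PageRank is a plain power iteration that ignores dangling nodes. It only shows the contract: one variant yields (nodeId, score) rows, the other stores scores on the nodes and returns summary statistics.

```python
def page_rank(nodes, edges, iterations=20, damping=0.85):
    """Plain power-iteration PageRank over a list of (source, target) pairs."""
    out_degree = {n: 0 for n in nodes}
    for source, _ in edges:
        out_degree[source] += 1
    score = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        nxt = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for source, target in edges:
            nxt[target] += damping * score[source] / out_degree[source]
        score = nxt
    return score


def stream(nodes, edges, **conf):
    """'stream' style: yield one (nodeId, score) row per node and store nothing."""
    yield from page_rank(nodes, edges, **conf).items()


def write(properties, nodes, edges, write_property="pagerank", **conf):
    """'write' style: store scores as node properties and return statistics."""
    for node_id, value in page_rank(nodes, edges, **conf).items():
        properties.setdefault(node_id, {})[write_property] = value
    return {"nodes": len(nodes), "writeProperty": write_property}


nodes = [1, 2, 3]
edges = [(1, 2), (2, 3), (3, 1)]
for node_id, score in stream(nodes, edges):
    print(node_id, round(score, 3))

node_properties = {}
print(write(node_properties, nodes, edges))
```

Streaming suits ad hoc exploration; writing back suits results that later queries should filter or sort on.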
Cypher Projection
Pass in Cypher statements for the node and relationship lists.
CALL algo.<name>(
  'MATCH ... RETURN id(n)',
  'MATCH (n)-->(m)
   RETURN id(n) as source,
          id(m) as target', {graph:'cypher'})
DEMO: OOP
Development
Neo4j Fits into Your Environment
• Data Storage and Business Rules Execution: Application, Graph Database Cluster (Neo4j, Neo4j, Neo4j), End User
• Data Mining and Aggregation: Ad Hoc Analysis, Bulk Analytic Infrastructure (Graph Compute Engine, EDW, …), Data Scientist
• Databases: Relational, NoSQL, Hadoop
Bolt + Official Language Drivers
• Foundational drivers for popular programming languages
• Bolt: streaming binary wire protocol
• Authoritative mapping to native type system, uniform across drivers
• Pluggable into richer frameworks
Drivers: JavaScript, Java, .NET, Python, PHP, ...
http://neo4j.com/developer/ http://neo4j.com/developer/language-guides/
Using Bolt: Official Language Drivers look all the same
With JavaScript
var neo4j = require("neo4j-driver"); // 1.x drivers: require("neo4j-driver").v1
var driver = neo4j.driver("bolt://localhost");
var session = driver.session();
var result = session.run("MATCH (u:User) RETURN u.name");
neo4j.com/developer/spring-data-neo4j
Spring Data Neo4j Neo4j OGM
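import java.util.HashSet;
import java.util.Set;
import org.neo4j.ogm.annotation.GeneratedValue;
import org.neo4j.ogm.annotation.Id;
import org.neo4j.ogm.annotation.NodeEntity;
import org.neo4j.ogm.annotation.Relationship;

@NodeEntity
public class Talk {
  @Id @GeneratedValue
  Long id;
  String title;
  Slot slot;
  Track track;
  // Incoming (:Person)-[:PRESENTS]->(:Talk) relationships
  @Relationship(type = "PRESENTS", direction = Relationship.INCOMING)
  Set<Person> speaker = new HashSet<>();
}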
Spring Data Neo4j Neo4j OGM
interface TalkRepository extends Neo4jRepository<Talk, Long> {
  @Query("MATCH (t:Talk)<-[rating:RATED]-(user) " +
         "WHERE t.id = {talkId} RETURN rating")
  List<Rating> getRatings(@Param("talkId") Long talkId);
  List<Talk> findByTitleContaining(String title);
}
github.com/neo4j-contrib/neo4j-spark-connector
Neo4j Spark Connector
github.com/neo4j-contrib/neo4j-jdbc
Neo4j JDBC Driver
Neo4j: THE Graph Database Platform
• Components: Graph Transactions, Graph Analytics, Data Integration, Development & Admin, Analytics Tooling, Drivers & APIs, Discovery & Visualization
• Users: Developers, Admins, Applications, Business Users, Data Analysts, Data Scientists
Neo4j: Graph Platform
Real-time Transactional and Analytic Processing
• Operational workloads
• Analytics workloads
Discovery and Visualization
• Interactive graph exploration
• Graph representation of data
Agility
• Native property graph model
• Dynamic schema
Developer Productivity
• Cypher - Declarative query language
• Procedural language extensions
• Worldwide developer community
Hardware efficiency
• 10x less CPU with index-free adjacency
• 10x less hardware than other platforms
Performance
• Index-free adjacency
• Millions of hops per second
Native Graph Architecture
• Index-free adjacency ensures lightning-fast retrieval of data and relationships
• Unlike other database models, Neo4j connects data as it is stored
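A rough way to picture index-free adjacency is to store the same edges twice: once as a sorted edge list that has to be searched on every hop, and once as per-node neighbour lists that are followed directly. The sketch below is only an analogy with hypothetical data and helper names; a Python dict lookup stands in for following a stored reference and says nothing about Neo4j's actual record layout.

```python
from bisect import bisect_left

# The same hypothetical friendships, stored two ways.
edges = sorted([("ann", "bob"), ("ann", "cat"), ("bob", "dan"), ("cat", "dan")])


def neighbours_via_index(node):
    """Relational style: every hop searches a sorted, index-like list of all edges."""
    i = bisect_left(edges, (node, ""))
    result = []
    while i < len(edges) and edges[i][0] == node:
        result.append(edges[i][1])
        i += 1
    return result


# Graph style: each node keeps direct references to its own neighbours.
adjacency = {}
for source, target in edges:
    adjacency.setdefault(source, []).append(target)
    adjacency.setdefault(target, [])


def neighbours_via_adjacency(node):
    """A hop reads the node's own neighbour list; other nodes' edges are never touched."""
    return adjacency[node]


def two_hops(start, neighbours):
    """Nodes reachable in exactly two hops, using the given neighbour function."""
    return sorted({n2 for n1 in neighbours(start) for n2 in neighbours(n1)})


print(two_hops("ann", neighbours_via_index))
print(two_hops("ann", neighbours_via_adjacency))
```

Both functions return the same answer; the difference is that the search cost of the first grows with the total number of edges, while the second depends only on the degree of the node being expanded.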
Neo4j Query Planner
Cost based Query Planner since Neo4j
• Uses transactional database statistics
• High performance Query Engine
• Bytecode compiled queries
• Future: Parallelism
Architecture Components
• Index-Free Adjacency: in memory and on flash/disk
• ACID Foundation: required for safe writes
• Full-Stack Clustering: causal consistency
• Security
• Language, Drivers, Tooling: developer experience, graph efficiency
• Graph Engine: cost-based optimizer, graph statistics, Cypher runtime
• Hardware Optimizations: for next-gen infrastructure
Neo4j – allows you to connect the dots
• Was built to efficiently
• store,
• query and
• manage highly connected data
• Transactional, ACID
• Real-time OLTP
• Open source
• Highly scalable on few machines
High Query Performance: Some Numbers
• Traverse 2-4M+ relationships per second and core
• Cost based query optimizer – complex queries return in milliseconds
• Import 100K-1M records per second transactionally
• Bulk import tens of billions of records in a few hours
Get Started
Neo4j Sandbox
How do I get it? Desktop – Container – Cloud
http://neo4j.com/download/
docker run neo4j
Neo4j Cluster Deployment Options
• Developer: Neo4j Desktop (free Enterprise License)
• On premise – Standalone or via OS package
• Containerized with official Docker Image
• In the Cloud
• AWS, GCE, Azure
• Using Resource Managers
• DC/OS – Marathon
• Kubernetes
• Docker Swarm
Active Community
• 10M+ Downloads: 3M+ from Neo4j Distribution, 7M+ from Docker
• 400+ Events: approximate number of Neo4j events per year
• 50k+ Meetups: number of Meetup members globally
• 50k+ Trained Developers: trained/certified Neo4j professionals
Summary: Graphs allow you to ...
• Keep your rich data model
• Handle relationships efficiently
• Write queries easily
• Develop applications quickly
• Have fun
Thank You!
Questions?!
@neo4j | neo4j.com
@mesirii | Michael Hunger
Users Love Neo4j
Causal Clustering
Core & Replica Servers Causal Consistency
Causal Clustering - Features
• Two Zones – Core + Edge
• Group of Core Servers – Consistent and Partition tolerant (CP)
• Transactional Writes
• Quorum Writes, Cluster Membership, Leader via Raft Consensus
• Scale out with Read Replicas
• Smart Bolt Drivers with
• Routing, Read & Write Sessions
• Causal Consistency with Bookmarks
Replica
• For massive query throughput
• Read-only replicas
• Not involved in Consensus Commit
Core
• Small group of Neo4j databases
• Fault-tolerant Consensus Commit
• Responsible for data safety
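The quorum rule behind the core servers' consensus commit can be written down in a few lines. This is a toy model of the majority check only, not of the Raft protocol (no leader election, log replication or terms); `majority` and `commit` are hypothetical names.

```python
def majority(core_size):
    """Smallest number of core members that forms a quorum."""
    return core_size // 2 + 1


def commit(core_members_up):
    """Return True if a write can commit, given one availability flag per core member."""
    acks = sum(1 for up in core_members_up if up)
    return acks >= majority(len(core_members_up))


print(commit([True, True, False]))   # 2 of 3 core members acknowledge
print(commit([True, False, False]))  # 1 of 3 core members acknowledge
```

Read replicas never appear in this calculation, which is why adding them scales reads without slowing down writes.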
Writing to the Core Cluster – Raft Consensus Commits
(Diagram: the Neo4j driver in the application server sends a write to the Neo4j cluster; the core servers acknowledge it and the driver reports success.)
Routed write statements
// bolt+routing lets the driver discover the cluster and route by access mode
Driver driver = GraphDatabase.driver( "bolt+routing://aCoreServer" );
try ( Session session = driver.session( AccessMode.WRITE ) )
{
    try ( Transaction tx = session.beginTransaction() )
    {
        tx.run( "MERGE (user:User {userId: {userId}})",
                parameters( "userId", userId ) );
        tx.success(); // mark the transaction for commit when it closes
    }
}
Bookmark
• Session token
• String (for portability)
• Opaque to application
• Represents ultimate user’s most
recent view of the graph
• More capabilities to come
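How a bookmark gives "read your own writes" can be sketched with a toy replica. Everything here is an assumption for illustration: real bookmarks are opaque strings, not integers, and a real server waits until it has caught up, whereas this `Replica` class simply raises an error.

```python
class Replica:
    """A read replica that applies committed transactions in order, possibly with lag."""

    def __init__(self):
        self.applied_tx = 0
        self.data = {}

    def apply(self, tx_id, key, value):
        """Apply one committed transaction from the core's log."""
        self.data[key] = value
        self.applied_tx = tx_id

    def read(self, key, bookmark=0):
        """Serve the read only if this replica has caught up to the bookmark."""
        if self.applied_tx < bookmark:
            raise TimeoutError("replica has not yet applied tx %d" % bookmark)
        return self.data.get(key)


log = [(1, "user:42", "created")]  # transactions committed by the core
bookmark = log[-1][0]              # handed back to the client after its write

stale, fresh = Replica(), Replica()
fresh.apply(*log[0])               # only one replica has caught up so far

print(fresh.read("user:42", bookmark))
print(stale.read("user:42"))       # without a bookmark, a stale answer is possible
```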
(Diagram: from Neo4j 3.0 to Neo4j 3.1 Causal Clustering: Bigger Clusters, Consensus Commit, Built-in load balancing.)
Neo4j 3.0: High Availability Cluster
• Master-Slave architecture
• Paxos consensus used for master election
• Transaction committed once written durably on the master
• Practical deployments: 10s of servers
Neo4j 3.1: Causal Cluster
• Two-part cluster: writeable Core and read-only Read Replicas
• Raft protocol used for leader election, membership changes and commitment of all transactions
• Transaction committed once written durably on a majority of the core members
• Practical deployments: 100s of servers
Flexible Authentication Options
Choose authentication method
• Built-in native users repository: testing/POC, single-instance deployments
• LDAP connector to Active Directory or openLDAP: production deployments
• Custom auth provider plugins: special deployment scenarios
(Diagram: Neo4j with its built-in native users, LDAP connectors to Active Directory and openLDAP, and an auth plugin extension module for custom plugins.)
LDAP Group to Role Mapping
dbms.security.ldap.authorization.group_to_role_mapping=\
"CN=Neo4j Read Only,OU=groups,DC=example,DC=com" = reader; \
"CN=Neo4j Read-Write,OU=groups,DC=example,DC=com" = publisher; \
"CN=Neo4j Schema Manager,OU=groups,DC=example,DC=com" = architect; \
"CN=Neo4j Administrator,OU=groups,DC=example,DC=com" = admin; \
"CN=Neo4j Procedures,OU=groups,DC=example,DC=com" = allowed_role
./conf/neo4j.conf
LDAP directory (BASE DN: DC=example,DC=com)
• OU=people: CN=Bob Smith, CN=Carl Junior
• OU=groups: CN=Neo4j Read Only, CN=Neo4j Read-Write, CN=Neo4j Schema Manager, CN=Neo4j Administrator, CN=Neo4j Procedures
The groups map to Neo4j permissions.
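To make the mapping concrete, here is a small sketch that parses a value in the format shown above and resolves a user's roles from their LDAP groups. It is not Neo4j's own parser: `parse_group_to_role_mapping` and `roles_for` are hypothetical helpers, and the user's group membership is invented for the example.

```python
def parse_group_to_role_mapping(value):
    """Parse '"<group DN>" = role1, role2; "<group DN>" = role3' into a dict."""
    mapping = {}
    for entry in value.split(";"):
        if not entry.strip():
            continue
        # The DN itself contains '=' and ',', so split on the last '=' only.
        group_dn, roles = entry.rsplit("=", 1)
        mapping[group_dn.strip().strip('"')] = [r.strip() for r in roles.split(",")]
    return mapping


setting = (
    '"CN=Neo4j Read Only,OU=groups,DC=example,DC=com" = reader; '
    '"CN=Neo4j Administrator,OU=groups,DC=example,DC=com" = admin'
)
mapping = parse_group_to_role_mapping(setting)


def roles_for(user_groups):
    """Union of the roles granted by each LDAP group the user belongs to."""
    return sorted({role for group in user_groups for role in mapping.get(group, [])})


# Assumed membership for illustration: Bob Smith is in the read-only group.
print(roles_for(["CN=Neo4j Read Only,OU=groups,DC=example,DC=com"]))
```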
Use Cases
Case Study: Knowledge Graphs at eBay
Bags
Men’s Backpack
Handbag
Case study: Solving real-time recommendations for the world’s largest retailer
Challenge
• In its drive to provide the best web experience for its customers, Walmart wanted to optimize its online recommendations.
• Walmart recognized the challenge it faced in delivering recommendations with traditional relational database technology.
Use of Neo4j
• Walmart uses Neo4j to quickly query customers’ past purchases, as well as instantly capture any new interests shown in the customers’ current online visit – essential for making real-time recommendations.
Result/Outcome
• With Neo4j, Walmart could substitute a heavy batch process with a simple and real-time graph database.
“As the current market leader in graph databases, and with enterprise features for scalability and availability, Neo4j is the right choice to meet our demands.”
- Marcos Vada, Walmart
Case study: eBay Now Tackles eCommerce Delivery Service Routing with Neo4j
Challenge
• The queries used to select the best courier for eBay’s routing system were simply taking too long, and eBay needed a solution to maintain a competitive service.
• The MySQL joins being used created a code base too slow and complex to maintain.
Use of Neo4j
• eBay is now using Neo4j’s graph database platform to redefine e-commerce, by making delivery of online and mobile orders quick and convenient.
Result/Outcome
• With Neo4j, eBay managed to eliminate the biggest roadblock between retailers and online shoppers by offering the option to have an item delivered the same day.
• The schema-flexible nature of the database allowed easy extensibility, speeding up development.
• The Neo4j solution was more than 1000x faster than the prior MySQL solution.
“Our Neo4j solution is literally thousands of times faster than the prior MySQL solution, with queries that require 10-100 times less code.”
– Volker Pacher, eBay
Top Tier US Retailer
Case study: Solving real-time promotions for a top US retailer
Challenge
• Suffered significant revenue loss due to legacy infrastructure.
• Particularly challenging when handling transaction volumes on peak shopping occasions such as Thanksgiving and Cyber Monday.
Use of Neo4j
• Neo4j is used to revolutionize and reinvent its real-time promotions engine.
• On average, Neo4j processes 90% of this retailer’s 35M+ daily transactions, each 3-22 hops, in 4ms or less.
Result/Outcome
• Reached an all-time high in online revenues, due to the Neo4j-based friction-free solution.
• Neo4j also enabled the company to be one of the first retailers to provide the same promotions across both online and traditional retail channels.
“On an average Neo4j processes 90% of this retailer’s 35M+ daily transactions, each 3-22 hops, in 4ms or less.”
– Top Tier US Retailer
Relational DBs Can’t Handle Relationships Well
• Cannot model or store data and relationships
without complexity
• Performance degrades with number and levels
of relationships, and database size
• Query complexity grows with need for JOINs
• Adding new types of data and relationships
requires schema redesign, increasing time to
market
… making traditional databases inappropriate
when data relationships are valuable in real-time
Slow development
Poor performance
Low scalability
Hard to maintain
Unlocking Value from Your Data Relationships
• Model your data as a graph of data
and relationships
• Use relationship information in real-
time to transform your business
• Add new relationships on the fly to
adapt to your changing business
Express Complex Queries Easily with Cypher
Find all direct reports and how many people they manage, up to 3 levels down
Cypher Query
MATCH (sub)-[:REPORTS_TO*0..3]->(boss),
      (report)-[:REPORTS_TO*1..3]->(sub)
WHERE boss.name = "Andrew K."
RETURN sub.name AS Subordinate,
       count(report) AS Total
(SQL Query: shown as an image on the slide)
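The variable-length REPORTS_TO pattern can be read as a bounded walk up the reporting chain. The sketch below reproduces that logic on a small hypothetical org chart; the names other than "Andrew K." and both helper functions are invented, and it assumes every employee has at most one manager.

```python
# Hypothetical org chart: employee -> the manager they report to.
reports_to = {
    "Nancy": "Andrew K.",
    "Janet": "Andrew K.",
    "Steven": "Nancy",
    "Michael": "Steven",
    "Robert": "Steven",
}


def managers_above(person, max_depth):
    """Yield the managers reached by following REPORTS_TO for up to `max_depth` hops."""
    depth = 0
    while person in reports_to and depth < max_depth:
        person = reports_to[person]
        depth += 1
        yield person


def team_sizes(boss, max_depth=3):
    """Count reports (1..max_depth levels down) for the boss and everyone 0..max_depth below."""
    people = set(reports_to) | set(reports_to.values())
    subs = {p for p in people if p == boss or boss in managers_above(p, max_depth)}
    totals = {}
    for sub in subs:
        count = sum(1 for p in people if sub in managers_above(p, max_depth))
        if count:  # like the MATCH, people without any reports produce no row
            totals[sub] = count
    return totals


print(team_sizes("Andrew K."))
```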

More Related Content

PDF
How Graph Databases efficiently store, manage and query connected data at s...
PPTX
GraphQL - The new "Lingua Franca" for API-Development
PDF
Neo4j Graph Platform Overview, Kurt Freytag, Neo4j
PDF
Intro to Cypher
PDF
Introducing Neo4j 3.0
PDF
Building Fullstack Graph Applications With Neo4j
PPTX
GraphTour - Neo4j Platform Overview
PDF
Finding the Needle in a Haystack With Knowledge Graphs
How Graph Databases efficiently store, manage and query connected data at s...
GraphQL - The new "Lingua Franca" for API-Development
Neo4j Graph Platform Overview, Kurt Freytag, Neo4j
Intro to Cypher
Introducing Neo4j 3.0
Building Fullstack Graph Applications With Neo4j
GraphTour - Neo4j Platform Overview
Finding the Needle in a Haystack With Knowledge Graphs

What's hot (20)

PDF
Neo4j-Databridge: Enterprise-scale ETL for Neo4j
PDF
GraphConnect 2014 SF: From Zero to Graph in 120: Model
PDF
Training Series: Build APIs with Neo4j GraphQL Library
PPT
Neo4J : Introduction to Graph Database
PDF
Visual, scalable, and manageable data loading to and from Neo4j with Apache Hop
PPTX
Neo4j Import Webinar
PDF
RDBMS to Graph Webinar
PDF
Power of Polyglot Search
PPTX
Introduction to Neo4j and .Net
PDF
Neo4j GraphDay Seattle- Sept19- in the enterprise
PDF
Building a Knowledge Graph using NLP and Ontologies
PDF
Training Week: Build APIs with Neo4j GraphQL Library
PDF
Your Roadmap for An Enterprise Graph Strategy
PDF
Graphs for Finance - AML with Neo4j Graph Data Science
PDF
Neo4j GraphDay Seattle- Sept19- Connected data imperative
PDF
Graph Algorithms for Developers
PDF
Neo4j GraphDay Seattle- Sept19- neo4j basic training
PDF
An Introduction to Graph: Database, Analytics, and Cloud Services
PPTX
Family tree of data – provenance and neo4j
PPTX
Graphs for AI & ML, Jim Webber, Neo4j
Neo4j-Databridge: Enterprise-scale ETL for Neo4j
GraphConnect 2014 SF: From Zero to Graph in 120: Model
Training Series: Build APIs with Neo4j GraphQL Library
Neo4J : Introduction to Graph Database
Visual, scalable, and manageable data loading to and from Neo4j with Apache Hop
Neo4j Import Webinar
RDBMS to Graph Webinar
Power of Polyglot Search
Introduction to Neo4j and .Net
Neo4j GraphDay Seattle- Sept19- in the enterprise
Building a Knowledge Graph using NLP and Ontologies
Training Week: Build APIs with Neo4j GraphQL Library
Your Roadmap for An Enterprise Graph Strategy
Graphs for Finance - AML with Neo4j Graph Data Science
Neo4j GraphDay Seattle- Sept19- Connected data imperative
Graph Algorithms for Developers
Neo4j GraphDay Seattle- Sept19- neo4j basic training
An Introduction to Graph: Database, Analytics, and Cloud Services
Family tree of data – provenance and neo4j
Graphs for AI & ML, Jim Webber, Neo4j
Ad

Similar to A whirlwind tour of graph databases (20)

PPTX
Neo4j Training Introduction
PDF
Introduction to Graph databases and Neo4j (by Stefan Armbruster)
PDF
5.17 - IntroductionToNeo4j-allSlides_1_2022_DanMc.pdf
PDF
Intro to Neo4j and Graph Databases
PDF
Graph Database Use Cases - StampedeCon 2015
PDF
Graph database Use Cases
PDF
GraphSummit Toronto: Keynote - Innovating with Graphs
PPTX
Introduction: Relational to Graphs
PDF
Neo4j GraphTalk Helsinki - Introduction and Graph Use Cases
PDF
Neo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdf
PDF
Big Data Tel Aviv 2019 v.3.0 I 'Graph database 4 beginners' - Michael Kogan
PPTX
Graph all the things - PRathle
PDF
Amsterdam - The Neo4j Graph Data Platform Today & Tomorrow
PDF
managing big data
PDF
Introduction to Neo4j
PDF
RDBMS to Graphs
PDF
Neo4j Introduction Workshop for Partners
PDF
Einführung in Neo4j
PDF
Graph Databases and Graph Data Science in Neo4j
PPTX
Neo4j GraphTalk Frankfurt - Einführung
Neo4j Training Introduction
Introduction to Graph databases and Neo4j (by Stefan Armbruster)
5.17 - IntroductionToNeo4j-allSlides_1_2022_DanMc.pdf
Intro to Neo4j and Graph Databases
Graph Database Use Cases - StampedeCon 2015
Graph database Use Cases
GraphSummit Toronto: Keynote - Innovating with Graphs
Introduction: Relational to Graphs
Neo4j GraphTalk Helsinki - Introduction and Graph Use Cases
Neo4j Graph Summit 2024 Workshop - EMEA - Breda_and_Munchen.pdf
Big Data Tel Aviv 2019 v.3.0 I 'Graph database 4 beginners' - Michael Kogan
Graph all the things - PRathle
Amsterdam - The Neo4j Graph Data Platform Today & Tomorrow
managing big data
Introduction to Neo4j
RDBMS to Graphs
Neo4j Introduction Workshop for Partners
Einführung in Neo4j
Graph Databases and Graph Data Science in Neo4j
Neo4j GraphTalk Frankfurt - Einführung
Ad

More from jexp (20)

PDF
Looming Marvelous - Virtual Threads in Java Javaland.pdf
PDF
Easing the daily grind with the awesome JDK command line tools
PDF
Looming Marvelous - Virtual Threads in Java
PPTX
GraphConnect 2022 - Top 10 Cypher Tuning Tips & Tricks.pptx
PPTX
Neo4j Connector Apache Spark FiNCENFiles
PPTX
How Graphs Help Investigative Journalists to Connect the Dots
PPTX
The Home Office. Does it really work?
PDF
Polyglot Applications with GraalVM
PPTX
Neo4j Graph Streaming Services with Apache Kafka
PPTX
APOC Pearls - Whirlwind Tour Through the Neo4j APOC Procedures Library
PPTX
Refactoring, 2nd Edition
PPTX
New Features in Neo4j 3.4 / 3.3 - Graph Algorithms, Spatial, Date-Time & Visu...
PDF
Practical Graph Algorithms with Neo4j
PPTX
A Game of Data and GraphQL
PPTX
Querying Graphs with GraphQL
PDF
Graphs & Neo4j - Past Present Future
PDF
Intro to Graphs and Neo4j
PDF
Class graph neo4j and software metrics
PDF
New Neo4j Auto HA Cluster
KEY
Spring Data Neo4j Intro SpringOne 2012
Looming Marvelous - Virtual Threads in Java Javaland.pdf
Easing the daily grind with the awesome JDK command line tools
Looming Marvelous - Virtual Threads in Java
GraphConnect 2022 - Top 10 Cypher Tuning Tips & Tricks.pptx
Neo4j Connector Apache Spark FiNCENFiles
How Graphs Help Investigative Journalists to Connect the Dots
The Home Office. Does it really work?
Polyglot Applications with GraalVM
Neo4j Graph Streaming Services with Apache Kafka
APOC Pearls - Whirlwind Tour Through the Neo4j APOC Procedures Library
Refactoring, 2nd Edition
New Features in Neo4j 3.4 / 3.3 - Graph Algorithms, Spatial, Date-Time & Visu...
Practical Graph Algorithms with Neo4j
A Game of Data and GraphQL
Querying Graphs with GraphQL
Graphs & Neo4j - Past Present Future
Intro to Graphs and Neo4j
Class graph neo4j and software metrics
New Neo4j Auto HA Cluster
Spring Data Neo4j Intro SpringOne 2012

Recently uploaded (20)

PDF
annual-report-2024-2025 original latest.
PDF
Business Analytics and business intelligence.pdf
PDF
168300704-gasification-ppt.pdfhghhhsjsjhsuxush
PPTX
The THESIS FINAL-DEFENSE-PRESENTATION.pptx
PPTX
Business Acumen Training GuidePresentation.pptx
PPTX
1_Introduction to advance data techniques.pptx
PPTX
AI Strategy room jwfjksfksfjsjsjsjsjfsjfsj
PDF
Recruitment and Placement PPT.pdfbjfibjdfbjfobj
PPTX
Introduction to Firewall Analytics - Interfirewall and Transfirewall.pptx
PDF
Foundation of Data Science unit number two notes
PPTX
Introduction to Knowledge Engineering Part 1
PPTX
climate analysis of Dhaka ,Banglades.pptx
PPTX
advance b rammar.pptxfdgdfgdfsgdfgsdgfdfgdfgsdfgdfgdfg
PDF
Fluorescence-microscope_Botany_detailed content
PDF
“Getting Started with Data Analytics Using R – Concepts, Tools & Case Studies”
PPTX
iec ppt-1 pptx icmr ppt on rehabilitation.pptx
PPT
Quality review (1)_presentation of this 21
PDF
BF and FI - Blockchain, fintech and Financial Innovation Lesson 2.pdf
PPT
Reliability_Chapter_ presentation 1221.5784
PDF
22.Patil - Early prediction of Alzheimer’s disease using convolutional neural...
annual-report-2024-2025 original latest.
Business Analytics and business intelligence.pdf
168300704-gasification-ppt.pdfhghhhsjsjhsuxush
The THESIS FINAL-DEFENSE-PRESENTATION.pptx
Business Acumen Training GuidePresentation.pptx
1_Introduction to advance data techniques.pptx
AI Strategy room jwfjksfksfjsjsjsjsjfsjfsj
Recruitment and Placement PPT.pdfbjfibjdfbjfobj
Introduction to Firewall Analytics - Interfirewall and Transfirewall.pptx
Foundation of Data Science unit number two notes
Introduction to Knowledge Engineering Part 1
climate analysis of Dhaka ,Banglades.pptx
advance b rammar.pptxfdgdfgdfsgdfgsdgfdfgdfgsdfgdfgdfg
Fluorescence-microscope_Botany_detailed content
“Getting Started with Data Analytics Using R – Concepts, Tools & Case Studies”
iec ppt-1 pptx icmr ppt on rehabilitation.pptx
Quality review (1)_presentation of this 21
BF and FI - Blockchain, fintech and Financial Innovation Lesson 2.pdf
Reliability_Chapter_ presentation 1221.5784
22.Patil - Early prediction of Alzheimer’s disease using convolutional neural...

A whirlwind tour of graph databases

  • 2. (Michael Hunger)-[:WORKS_FOR]->(Neo4j) michael@neo4j.com | @mesirii | github.com/jexp | jexp.de/blog Michael Hunger - Head of Developer Relations @Neo4j
  • 4. Why Graphs? Because the World is a Graph!
  • 5. Everything and Everyone is Connected • people, places, events • companies, markets • countries, history, politics • sciences, art, teaching • technology, networks, machines, applications, users • software, code, dependencies, architecture, deployments • criminals, fraudsters and their behavior
  • 7. Value from Data Relationships Common Use Cases Internal Applications Master Data Management Network and IT Operations Fraud Detection Customer-Facing Applications Real-Time Recommendations Graph-Based Search Identity and Access Management
  • 8. The Rise of Connections in Data Networks of People Business Processes Knowledge Networks E.g., Risk management, Supply chain, Payments E.g., Employees, Customers, Suppliers, Partners, Influencers E.g., Enterprise content, Domain specific content, eCommerce content Data connections are increasing as rapidly as data volumes
  • 9. Harnessing Connections Drives Business Value Enhanced Decision Making Hyper Personalization Massive Data Integration Data Driven Discovery & Innovation Product Recommendations Personalized Health Care Media and Advertising Fraud Prevention Network Analysis Law Enforcement Drug Discovery Intelligence and Crime Detection Product & Process Innovation 360 view of customer Compliance Optimize Operations Connected Data at the Center AI & Machine Learning Price optimization Product Recommendations Resource allocation Digital Transformation Megatrends
  • 13. Newcomers in the last 3 years • DSE Graph • Agens Graph • IBM Graph • JanusGraph • Tibco GraphDB • Microsoft CosmosDB • TigerGraph • MemGraph • AWS Neptune • SAP HANA Graph
  • 14. Database Technology Architectures Graph DB Connected Data Discrete Data Relational DBMS Other NoSQL Right Tool for the Job
  • 15. The impact of Graphs How Graphs are changing the World
  • 21. Cancer Research - Candiolo Cancer Institute “Our application relies on complex hierarchical data, which required a more flexible model than the one provided by the traditional relational database model,” said Andrea Bertotti, MD neo4j.com/case-studies/candiolo-cancer-institute-ircc/
  • 22. Graph Databases in Healthcare and Life Sciences 14 Presenters from all around Europe on: • Genome • Proteome • Human Pathway • Reactome • SNP • Drug Discovery • Metabolic Symbols • ... neo4j.com/blog/neo4j-life-sciences-healthcare-workshop-berlin/
  • 26. Common Graph Technology Use Cases Real-Time Recommendations Fraud Detection Network & IT Operations Master Data Management Knowledge Graph Identity & Access Management AirBnb
  • 27. Handling Large Graph Work Loads for Enterprises Real-time promotion recommendations • Record “Cyber Monday” sales • About 35M daily transactions • Each transaction is 3-22 hops • Queries executed in 4ms or less • Replaced IBM Websphere commerce Marriott’s Real-time Pricing Engine • 300M pricing operations per day • 10x transaction throughput on half the hardware compared to Oracle • Replaced Oracle database Handling Package Routing in Real-Time • Large postal service with over 500k employees • Neo4j routes 7M+ packages daily at peak, with peaks of 5,000+ routing operations per second.
  • 28. Software Financial Services Telecom Retail & Consumer Goods Media & Entertainment Other Industries Airbus
  • 30. Machine Learning is Based on Graphs
  • 34. The Property Graph Model, Import, Query
  • 35. The Whiteboard Model Is the Physical Model Eliminates Graph-to-Relational Mapping In your data model Bridge the gap between business and IT models In your application Greatly reduce need for application code
  • 36. Property Graph Model Components Nodes • The objects in the graph • Can have name-value properties • Can be labeled Relationships • Relate nodes by type and direction • Can have name-value properties Diagram labels: PERSON (name: "Dan", born: May 29, 1970, twitter: "@dan"), PERSON (name: "Ann", born: Dec 5, 1975), CAR (brand: "Volvo", model: "V70"), LOVES, LOVES, LIVES WITH, since: Jan 10, 2011
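The components on this slide can be sketched in a few lines of Python. This is only an illustration: the `Node` and `Relationship` classes and the `outgoing` helper are hypothetical names, not a Neo4j API, and the direction of the two LOVES relationships is an assumption because the transcript lists only the relationship types.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """An object in the graph: labels plus name-value properties."""
    labels: set
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """Relates two nodes by type and direction; may carry properties."""
    start: Node
    type: str
    end: Node
    properties: dict = field(default_factory=dict)


dan = Node({"Person"}, {"name": "Dan", "twitter": "@dan"})
ann = Node({"Person"}, {"name": "Ann"})

# Assumed directions: the slide transcript only lists the relationship types.
relationships = [
    Relationship(dan, "LOVES", ann),
    Relationship(ann, "LOVES", dan),
]


def outgoing(node, rel_type):
    """Return the end nodes of relationships of rel_type that start at node."""
    return [r.end for r in relationships if r.start is node and r.type == rel_type]


loved_by_dan = [n.properties["name"] for n in outgoing(dan, "LOVES")]
```

Here `loved_by_dan` holds the names reached by following LOVES from Dan, which mirrors the pattern in the Cypher example on the next slide.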
  • 37. Cypher: Powerful and Expressive Query Language MATCH (:Person { name:"Dan"} ) -[:LOVES]-> (:Person { name:"Ann"} ) LOVES Dan Ann LABEL PROPERTY NODE NODE LABEL PROPERTY
  • 38. Relational Versus Graph Models Relational Model Graph Model KNOWS ANDREAS TOBIAS MICA DELIA Person Friend Person-Friend ANDREAS DELIA TOBIAS MICA
  • 40. Our starting point – Northwind ER
  • 41. Building Relationships in Graphs ORDERED Customer Order
  • 45. Find the Join Tables
  • 46. Simple Join Tables Become Relationships
  • 47. Attributed Join Tables Become Relationships with Properties
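The two rules above (simple join tables become relationships, attributed join tables become relationships with properties) can be sketched in Python as follows. The sample rows, the `Quantity` column, the `employee_territories` table and the `IN_TERRITORY` type are assumptions made for illustration; only `OrderID`, `ProductID` and `INCLUDES` appear elsewhere in the deck.

```python
# Made-up sample rows, as they might come out of the relational tables.
employee_territories = [  # simple join table: only the two foreign keys
    {"EmployeeID": 7, "TerritoryID": "T1"},
]
order_details = [  # attributed join table: foreign keys plus extra columns
    {"OrderID": 1, "ProductID": 10, "Quantity": 3},
    {"OrderID": 2, "ProductID": 11, "Quantity": 1},
]


def join_table_to_relationships(rows, from_key, rel_type, to_key):
    """Turn each join-table row into (start, type, end, properties).

    Columns other than the two foreign keys become relationship properties,
    so a simple join table yields relationships with empty properties.
    """
    relationships = []
    for row in rows:
        props = {k: v for k, v in row.items() if k not in (from_key, to_key)}
        relationships.append((row[from_key], rel_type, row[to_key], props))
    return relationships


in_territory = join_table_to_relationships(
    employee_territories, "EmployeeID", "IN_TERRITORY", "TerritoryID")
includes = join_table_to_relationships(
    order_details, "OrderID", "INCLUDES", "ProductID")
```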
  • 51. You all know SQL SELECT distinct c.CompanyName FROM customers AS c JOIN orders AS o ON (c.CustomerID = o.CustomerID) JOIN order_details AS od ON (o.OrderID = od.OrderID) JOIN products AS p ON (od.ProductID = p.ProductID) WHERE p.ProductName = 'Chocolat'
  • 52. Apache Tinkerpop 3.3.x - Gremlin g = graph.traversal(); g.V().hasLabel('Product') .has('productName','Chocolat') .in('INCLUDES') .in('ORDERED') .values('companyName').dedup();
  • 53. W3C Sparql PREFIX sales_db: <http://sales.northwind.com/> SELECT distinct ?company_name WHERE { ?c sales_db:CompanyName ?company_name . ?c sales_db:ORDERED ?o . ?o sales_db:ITEMS ?od . ?od sales_db:INCLUDES ?p . ?p sales_db:ProductName "Chocolat" . }
  • 55. Basic Pattern: Customer's Orders? MATCH (:Customer {custName:"Delicatessen"} ) -[:ORDERED]-> (order:Order) RETURN order VAR LABEL NODE NODE LABEL PROPERTY ORDERED Customer Order REL
  • 56. Basic Query: Customer's Orders? MATCH (c:Customer)-[:ORDERED]->(order) WHERE c.customerName = 'Delicatessen' RETURN *
  • 57. Basic Query: Customer's Frequent Purchases? MATCH (c:Customer)-[:ORDERED]-> ()-[:INCLUDES]->(p:Product) WHERE c.customerName = 'Delicatessen' RETURN p.productName, count(*) AS freq ORDER BY freq DESC LIMIT 10;
  • 58. openCypher - Recommendation MATCH (c:Customer)-[:ORDERED]->(o1)-[:INCLUDES]->(p), (peer)-[:ORDERED]->(o2)-[:INCLUDES]->(p), (peer)-[:ORDERED]->(o3)-[:INCLUDES]->(reco) WHERE c.customerId = $customerId AND NOT (c)-[:ORDERED]->()-[:INCLUDES]->(reco) RETURN reco.productName, count(*) AS freq ORDER BY freq DESC LIMIT 10
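To make the logic of the recommendation query concrete, here is a small in-memory Python sketch. The sample data and the `recommend` function are hypothetical, and the sketch counts peer orders rather than every matching path, so its frequencies can differ from what the Cypher `count(*)` returns on the same data.

```python
from collections import Counter

# Hypothetical sample data: customer -> list of orders, each order a set of products.
orders_by_customer = {
    "c1": [{"Chocolat", "Tea"}],
    "c2": [{"Chocolat"}, {"Coffee", "Tea"}],
    "c3": [{"Chocolat", "Coffee"}, {"Biscuits"}],
    "c4": [{"Wine"}],
}


def recommend(customer_id, limit=10):
    """Rank products bought by peers who share at least one product with the customer."""
    own = set().union(*orders_by_customer[customer_id])
    freq = Counter()
    for peer, peer_orders in orders_by_customer.items():
        if peer == customer_id:
            continue
        peer_products = set().union(*peer_orders)
        if not own & peer_products:
            continue  # no shared purchase, so this customer is not a peer
        for order in peer_orders:
            for product in order - own:  # skip what the customer already bought
                freq[product] += 1
    return freq.most_common(limit)


recommendations = recommend("c1")
```

With this data, customer `c1` shares Chocolat with `c2` and `c3`, so their other purchases are ranked by how many peer orders contain them, while `c4` contributes nothing.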
  • 59. Product Cross-Sell MATCH (:Product {productName: 'Chocolat'})<-[:INCLUDES]-(:Order) <-[:SOLD]-(employee)-[:SOLD]->(o2)-[:INCLUDES]->(cross:Product) RETURN employee.firstName, cross.productName, count(distinct o2) AS freq ORDER BY freq DESC LIMIT 5;
  • 61. openCypher... ...is a community effort to evolve Cypher, and to make it the most useful language for querying property graphs openCypher implementations SAP Hana Graph, Redis, Agens Graph, Cypher.PL, Neo4j
  • 62. github.com/opencypher Language Artifacts ● Cypher 9 specification ● ANTLR and EBNF Grammars ● Formal Semantics (SIGMOD) ● TCK (Cucumber test suite) ● Style Guide Implementations & Code ● openCypher for Apache Spark ● openCypher for Gremlin ● open source frontend (parser) ● ...
  • 63. Cypher 10 ● Next version of Cypher ● Actively working on natural language specification ● New features ○ Subqueries ○ Multiple graphs ○ Path patterns ○ Configurable pattern matching semantics
  • 65. Extending Neo4j - User Defined Procedures & Functions Neo4j Execution Engine User Defined Procedure User Defined Functions Applications Bolt User Defined Procedures & Functions let you write custom code that is: • Written in any JVM language • Deployed to the Database • Accessed by applications via Cypher
  • 66. Procedure Examples Built-In • Metadata Information • Index Management • Security • Cluster Information • Query Listing & Cancellation • ... Libraries • APOC (std library) • Spatial • RDF (neosemantics) • NLP • ... neo4j.com/developer/procedures-functions
  • 69. ”Graph analysis is possibly the single most effective competitive differentiator for organizations pursuing data- driven operations and decisions“ The Impact of Connected Data
  • 70. Existing Options (so far) •Data Processing •Spark with GraphX, Flink with Gelly •Gremlin Graph Computer •Dedicated Graph Processing •Urika, GraphLab, Giraph, Mosaic, GPS, Signal-Collect, Gradoop •Data Scientist Toolkit •igraph, NetworkX, Boost in Python, R, C
  • 72. Goal: Iterate Quickly •Combine data from sources into one graph •Project to relevant subgraphs •Enrich data with algorithms •Traverse, collect, filter, aggregate with queries •Visualize, Explore, Decide, Export •From all APIs and Tools
  • 74. Usage 1. Call as Cypher procedure 2. Pass in specification (Label, Prop, Query) and configuration 3. ~.stream variant returns (a lot of) results: CALL algo.<name>.stream('Label','TYPE',{conf}) YIELD nodeId, score 4. non-stream variant writes results to the graph and returns statistics: CALL algo.<name>('Label','TYPE',{conf})
  • 75. Pass in Cypher statement for node- and relationship-lists. CALL algo.<name>( 'MATCH ... RETURN id(n)', 'MATCH (n)-->(m) RETURN id(n) as source, id(m) as target', {graph:'cypher'}) Cypher Projection
  • 78. Data Storage and Business Rules Execution Data Mining and Aggregation Neo4j Fits into Your Environment Application Graph Database Cluster Neo4j Neo4j Neo4j Ad Hoc Analysis Bulk Analytic Infrastructure Graph Compute Engine EDW … Data Scientist End User Databases Relational NoSQL Hadoop
  • 79. Official Language Drivers • Foundational drivers for popular programming languages • Bolt: streaming binary wire protocol • Authoritative mapping to native type system, uniform across drivers • Pluggable into richer frameworks JavaScript Java .NET Python PHP, .... Drivers Bolt
  • 80. Bolt + Official Language Drivers http://neo4j.com/developer/ http://neo4j.com/developer/language-guides/
  • 81. Using Bolt: Official Language Drivers all look the same With JavaScript var driver = neo4j.driver("bolt://localhost"); var session = driver.session(); var result = session.run("MATCH (u:User) RETURN u.name");
  • 82. neo4j.com/developer/spring-data-neo4j Spring Data Neo4j Neo4j OGM @NodeEntity public class Talk { @Id @GeneratedValue Long id; String title; Slot slot; Track track; @Relationship(type="PRESENTS", direction=INCOMING) Set<Person> speaker = new HashSet<>(); }
  • 83. Spring Data Neo4j Neo4j OGM interface TalkRepository extends Neo4jRepository<Talk, Long> { @Query("MATCH (t:Talk)<-[rating:RATED]-(user) WHERE t.id = $talkId RETURN rating") List<Rating> getRatings(@Param("talkId") Long talkId); List<Talk> findByTitleContaining(String title); }
  • 87. Graph Transactions Graph Analytics Data Integration Development & Admin Analytics Tooling Drivers & APIs Discovery & Visualization Developers Admins Applications Business Users Data Analysts Data Scientists
  • 88. Neo4j: Graph Platform • Real-time Transactional and Analytic Processing: Operational workloads; Analytics workloads • Discovery and Visualization: Interactive graph exploration; Graph representation of data • Agility: Native property graph model; Dynamic schema • Developer Productivity: Cypher - Declarative query language; Procedural language extensions; Worldwide developer community • Hardware efficiency: 10x less CPU with index-free adjacency; 10x less hardware than other platforms • Performance: Index-free adjacency; Millions of hops per second
  • 90. Index-free adjacency ensures lightning-fast retrieval of data and relationships Native Graph Architecture Index free adjacency Unlike other database models Neo4j connects data as it is stored
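The idea behind index-free adjacency can be illustrated with a toy Python sketch in which every node holds direct references to its neighbours, so each hop follows a reference instead of consulting a lookup structure. This does not model Neo4j's actual storage format; the `Node` class, the `hops` helper and the connections between the names are assumptions for illustration.

```python
class Node:
    """A node that holds direct references to its neighbours."""

    def __init__(self, name):
        self.name = name
        self.neighbours = []  # direct references, no separate lookup structure


def connect(a, b):
    """Add a directed connection from a to b."""
    a.neighbours.append(b)


def hops(start, depth):
    """Follow direct references for `depth` hops and return the names reached."""
    frontier = [start]
    for _ in range(depth):
        frontier = [n for node in frontier for n in node.neighbours]
    return [node.name for node in frontier]


andreas, tobias, mica = Node("Andreas"), Node("Tobias"), Node("Mica")
connect(andreas, tobias)  # made-up connections for the example
connect(tobias, mica)

two_hops = hops(andreas, 2)
```

The cost of each hop here depends on the number of neighbours of the current node, not on the total number of nodes, which is the property the slide refers to.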
  • 91. Neo4j Query Planner Cost based Query Planner since Neo4j • Uses transactional database statistics • High performance Query Engine • Bytecode compiled queries • Future: Parallelism
  • 92. Architecture Components 1. Index-Free Adjacency: In memory and on flash/disk 2. ACID Foundation: Required for safe writes 3. Full-Stack Clustering: Causal consistency, Security 4. Language, Drivers, Tooling: Developer Experience, Graph Efficiency 5. Graph Engine: Cost-Based Optimizer, Graph Statistics, Cypher Runtime 6. Hardware Optimizations: For next-gen infrastructure
  • 93. Neo4j – allows you to connect the dots • Was built to efficiently • store, • query and • manage highly connected data • Transactional, ACID • Real-time OLTP • Open source • Highly scalable on few machines
  • 94. High Query Performance: Some Numbers • Traverse 2-4M+ relationships per second and core • Cost based query optimizer – complex queries return in milliseconds • Import 100K-1M records per second transactionally • Bulk import tens of billions of records in a few hours
  • 97. How do I get it? Desktop – Container – Cloud http://neo4j.com/download/ docker run neo4j
  • 98. Neo4j Cluster Deployment Options • Developer: Neo4j Desktop (free Enterprise License) • On premise – Standalone or via OS package • Containerized with official Docker Image • In the Cloud • AWS, GCE, Azure • Using Resource Managers • DC/OS – Marathon • Kubernetes • Docker Swarm
  • 99. Active Community • Downloads: 10M+ (3M+ from Neo4j Distribution, 7M+ from Docker) • Events: 400+ (approximate number of Neo4j events per year) • Meetups: 50k+ (number of meetup members globally) • Trained Developers: 50k+ trained/certified Neo4j professionals
  • 100. Summary: Graphs allow you to ... • Keep your rich data model • Handle relationships efficiently • Write queries easily • Develop applications quickly • Have fun
  • 101. Thank You! Questions?! @neo4j | neo4j.com @mesirii | Michael Hunger
  • 103. Causal Clustering Core & Replica Servers Causal Consistency
  • 104. Causal Clustering - Features • Two Zones – Core + Edge • Group of Core Servers – Consistent and Partition tolerant (CP) • Transactional Writes • Quorum Writes, Cluster Membership, Leader via Raft Consensus • Scale out with Read Replicas • Smart Bolt Drivers with • Routing, Read & Write Sessions • Causal Consistency with Bookmarks
  • 105. • For massive query throughput • Read-only replicas • Not involved in Consensus Commit Replica • Small group of Neo4j databases • Fault-tolerant Consensus Commit • Responsible for data safety Core
  • 106. Writing to the Core Cluster Neo4j Driver ✓ ✓ ✓ Success Neo4j Cluster
  • 108. Routed write statements Driver driver = GraphDatabase.driver( "bolt+routing://aCoreServer" ); try ( Session session = driver.session( AccessMode.WRITE ) ) { try ( Transaction tx = session.beginTransaction() ) { tx.run( "MERGE (user:User {userId: $userId})", parameters( "userId", userId ) ); tx.success(); } }
  • 109. Bookmark • Session token • String (for portability) • Opaque to application • Represents ultimate user’s most recent view of the graph • More capabilities to come
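How a bookmark gives a client a causally consistent read can be sketched with a toy Python simulation. The `Core` and `ReadReplica` classes are hypothetical stand-ins, not driver or server APIs, and the catch-up step is simplified: a real replica would wait until it has applied the bookmarked transaction, whereas this sketch pulls the missing entries immediately.

```python
class Core:
    """Stand-in for the Core cluster: an ordered log of committed writes."""

    def __init__(self):
        self.log = []

    def write(self, value):
        self.log.append(value)
        # The bookmark is handed out as a string the application treats as opaque.
        return str(len(self.log))


class ReadReplica:
    """Stand-in for a Read Replica that may lag behind the Core."""

    def __init__(self, core):
        self.core = core
        self.applied = 0  # number of log entries applied so far

    def read(self, bookmark=None):
        if bookmark is not None and self.applied < int(bookmark):
            # Assumption: a real replica waits until it has caught up;
            # this sketch simply applies the missing entries right away.
            self.applied = int(bookmark)
        return self.core.log[: self.applied]


core = Core()
replica = ReadReplica(core)
bookmark = core.write("MERGE (user:User {userId: 42})")

stale_read = replica.read()           # no bookmark: may miss the write
causal_read = replica.read(bookmark)  # with bookmark: sees the write
```

Passing the bookmark from the writing session to the reading session is what lets the application read its own writes from a replica.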
  • 110. Data Massive High 3.0 Bigger Clusters Consensus Commit Built-in load balancing 3.1 Causal Clustering
  • 111. Neo4j 3.0, High Availability Cluster: Master-Slave architecture; Paxos consensus used for master election; Transaction committed once written durably on the master; Practical deployments: 10s servers. Neo4j 3.1, Causal Cluster: Two part cluster: writeable Core and read-only read replicas; Raft protocol used for leader election, membership changes and commitment of all transactions; Transaction committed once written durably on a majority of the core members; Practical deployments: 100s servers.
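The "majority of the core members" rule can be made concrete with a little arithmetic. The following Python sketch uses hypothetical helper names and shows how many members must acknowledge a commit, and how many can fail, for a few cluster sizes.

```python
def majority(core_members):
    """Smallest number of core members that forms a majority."""
    return core_members // 2 + 1


def tolerated_failures(core_members):
    """Core members that can fail while a majority is still reachable."""
    return core_members - majority(core_members)


# cluster size -> (members needed to commit, failures tolerated)
sizes = {n: (majority(n), tolerated_failures(n)) for n in (3, 5, 7)}
```

A three-member Core therefore commits once two members have the transaction durably and keeps accepting writes with one member down.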
  • 112. Causal Clustering - Features • Two Zones – Core + Edge • Group of Core Servers – Consistent and Partition tolerant (CP) • Transactional Writes • Quorum Writes, Cluster Membership, Leader via Raft Consensus • Scale out with Read Replicas • Smart Bolt Drivers with • Routing, Read & Write Sessions • Causal Consistency with Bookmarks
  • 113. • For massive query throughput • Read-only replicas • Not involved in Consensus Commit Replica • Small group of Neo4j databases • Fault-tolerant Consensus Commit • Responsible for data safety Core
  • 114. Writing to the Core Cluster – Raft Consensus Commits Neo4j Driver ✓ ✓ ✓ Success Neo4j Cluster
  • 116. Routed write statements Driver driver = GraphDatabase.driver( "bolt+routing://aCoreServer" ); try ( Session session = driver.session( AccessMode.WRITE ) ) { try ( Transaction tx = session.beginTransaction() ) { tx.run( "MERGE (user:User {userId: $userId})", parameters( "userId", userId ) ); tx.success(); } }
  • 117. Bookmark • Session token • String (for portability) • Opaque to application • Represents ultimate user’s most recent view of the graph • More capabilities to come
  • 118. Data Massive High 3.0 Bigger Clusters Consensus Commit Built-in load balancing 3.1 Causal Clustering
  • 119. Flexible Authentication Options Choose authentication method • Built-in native users repository: Testing/POC, single-instance deployments • LDAP connector to Active Directory or openLDAP: Production deployments • Custom auth provider plugins: Special deployment scenarios Diagram: Neo4j with Built-in Native Users, LDAP connector (Active Directory, openLDAP), Auth Plugin Extension Module (Custom Plugin)
  • 120. Flexible Authentication Options LDAP Group to Role Mapping ./conf/neo4j.conf dbms.security.ldap.authorization.group_to_role_mapping= "CN=Neo4j Read Only,OU=groups,DC=example,DC=com" = reader; "CN=Neo4j Read-Write,OU=groups,DC=example,DC=com" = publisher; "CN=Neo4j Schema Manager,OU=groups,DC=example,DC=com" = architect; "CN=Neo4j Administrator,OU=groups,DC=example,DC=com" = admin; "CN=Neo4j Procedures,OU=groups,DC=example,DC=com" = allowed_role Directory tree: BASE DN DC=example, DC=com; OU=people (CN=Bob Smith, CN=Carl Junior); OU=groups (CN=Neo4j Read Only, CN=Neo4j Read-Write, CN=Neo4j Schema Manager, CN=Neo4j Administrator, CN=Neo4j Procedures) Map to Neo4j permissions
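To show what the mapping value expresses, the following Python sketch parses a shortened copy of it into a dictionary from group DN to role. It only illustrates the format shown on the slide and is not how Neo4j itself reads the setting; the function name and the regular expression are assumptions, and the sketch handles just one role per group.

```python
import re

# Shortened copy of the mapping value from the slide.
mapping_value = (
    '"CN=Neo4j Read Only,OU=groups,DC=example,DC=com" = reader; '
    '"CN=Neo4j Administrator,OU=groups,DC=example,DC=com" = admin'
)


def parse_group_to_role_mapping(value):
    """Parse '"<group DN>" = <role>; ...' into a dict of group DN -> role."""
    pairs = re.findall(r'"([^"]+)"\s*=\s*([A-Za-z_]+)', value)
    return {dn: role for dn, role in pairs}


mapping = parse_group_to_role_mapping(mapping_value)
admin_role = mapping["CN=Neo4j Administrator,OU=groups,DC=example,DC=com"]
```

A user who is a member of the LDAP group `CN=Neo4j Administrator` would thus be treated as having the `admin` role.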
  • 122. Case Study: Knowledge Graphs at eBay
  • 123. Case Study: Knowledge Graphs at eBay
  • 124. Case Study: Knowledge Graphs at eBay
  • 125. Case Study: Knowledge Graphs at eBay Bags
  • 126. Men’s Backpack Handbag Case Study: Knowledge Graphs at eBay
  • 127. Case study: Solving real-time recommendations for the World’s largest retailer. Challenge • In its drive to provide the best web experience for its customers, Walmart wanted to optimize its online recommendations. • Walmart recognized the challenge it faced in delivering recommendations with traditional relational database technology. Use of Neo4j • Walmart uses Neo4j to quickly query customers’ past purchases, as well as instantly capture any new interests shown in the customers’ current online visit – essential for making real-time recommendations. Result/Outcome • With Neo4j, Walmart could substitute a heavy batch process with a simple and real-time graph database. “As the current market leader in graph databases, and with enterprise features for scalability and availability, Neo4j is the right choice to meet our demands”. - Marcos Vada, Walmart
  • 128. Case study: eBay Now Tackles eCommerce Delivery Service Routing with Neo4j Challenge • The queries used to select the best courier for eBay's routing system were simply taking too long and they needed a solution to maintain a competitive service. • The MySQL joins being used created a code base too slow and complex to maintain. Use of Neo4j • eBay is now using Neo4j’s graph database platform to redefine e-commerce, by making delivery of online and mobile orders quick and convenient. Result/Outcome • With Neo4j eBay managed to eliminate the biggest roadblock between retailers and online shoppers: the option to have your item delivered the same day. • The schema-flexible nature of the database allowed easy extensibility, speeding up development. • The Neo4j solution was more than 1000x faster than the prior MySQL solution. “Our Neo4j solution is literally thousands of times faster than the prior MySQL solution, with queries that require 10-100 times less code.” – Volker Pacher, eBay
  • 129. Case study: Solving Real-time promotions for a top US retailer Challenge • Suffered significant revenue loss, due to legacy infrastructure. • Particularly challenging when handling transaction volumes on peak shopping occasions such as Thanksgiving and Cyber Monday. Use of Neo4j • Neo4j is used to revolutionize and reinvent its real-time promotions engine. • On average Neo4j processes 90% of this retailer’s 35M+ daily transactions, each 3-22 hops, in 4ms or less. Result/Outcome • Reached an all time high in online revenues, due to the Neo4j-based friction free solution. • Neo4j also enabled the company to be one of the first retailers to provide the same promotions across both online and traditional retail channels. “On an average Neo4j processes 90% of this retailer’s 35M+ daily transactions, each 3-22 hops, in 4ms or less.” – Top Tier US Retailer
  • 130. Relational DBs Can’t Handle Relationships Well • Cannot model or store data and relationships without complexity • Performance degrades with number and levels of relationships, and database size • Query complexity grows with need for JOINs • Adding new types of data and relationships requires schema redesign, increasing time to market … making traditional databases inappropriate when data relationships are valuable in real-time Slow development Poor performance Low scalability Hard to maintain
  • 131. Unlocking Value from Your Data Relationships • Model your data as a graph of data and relationships • Use relationship information in real-time to transform your business • Add new relationships on the fly to adapt to your changing business
  • 132. Express Complex Queries Easily with Cypher Find all direct reports and how many people they manage, up to 3 levels down Cypher Query: MATCH (sub)-[:REPORTS_TO*0..3]->(boss), (report)-[:REPORTS_TO*1..3]->(sub) WHERE boss.name = "Andrew K." RETURN sub.name AS Subordinate, count(report) AS Total
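What the variable-length Cypher query computes can be spelled out in plain Python. This sketch assumes a made-up reporting tree in which every person has at most one manager; the names other than "Andrew K." and both helper functions are hypothetical.

```python
# Made-up hierarchy: person -> the manager they report to.
reports_to = {
    "Bob": "Andrew K.",
    "Eve": "Andrew K.",
    "Carol": "Bob",
    "Dave": "Carol",
}


def managers_within(person, max_hops):
    """Managers reached from person by following REPORTS_TO 1..max_hops times."""
    found = []
    current = person
    for _ in range(max_hops):
        current = reports_to.get(current)
        if current is None:
            break
        found.append(current)
    return found


def team_sizes(boss, max_hops=3):
    """For the boss and everyone up to max_hops below, count the people under them."""
    people = set(reports_to) | set(reports_to.values())
    subs = [p for p in people if p == boss or boss in managers_within(p, max_hops)]
    totals = {}
    for sub in subs:
        count = sum(1 for p in people if sub in managers_within(p, max_hops))
        if count:  # like the MATCH, people with nobody below them drop out
            totals[sub] = count
    return totals


totals = team_sizes("Andrew K.")
```

The `*0..3` range in the query is why the boss appears in the result alongside the subordinates, and the `*1..3` range is why each person's count covers up to three levels below them.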