INTRO TO
GRAPH DATABASES
by Oxana Goriuc
www.fraugster.com
About Fraugster:
● We are a payment security company fighting eCommerce payment fraud
● We are a team of 61 employees of 26 nationalities
● The company was founded in 2014 and is based in Berlin (Rocket Tower)
Agenda
● Graph Theory basics
● Graph Databases and their applications
● Dgraph Intro
● Dgraph Deployment
● Data Loading
● Query Language
● Visualization tool
What is a graph?
A graph is a collection of nodes and the relationships (edges) between them
Entity = Connected component
[Figure: example graph (legend: Node, Edge) with nodes txn1, txn2, txn3, txn4, txn5, txn6, phone, email and credit card]
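As a rough illustration of "Entity = Connected component", the sketch below stores a small graph as an adjacency list and groups its nodes into connected components with a breadth-first search. The edges are invented for illustration: the slide's figure does not survive in this transcript, so which transaction links to which attribute is an assumption, as are the function names.

```python
from collections import deque

# Hypothetical edges: the actual links in the slide's figure are not recoverable.
EDGES = [
    ("txn1", "phone"),
    ("txn2", "phone"),
    ("txn2", "email"),
    ("txn3", "email"),
    ("txn4", "credit card"),
    ("txn5", "credit card"),
]
ISOLATED = ["txn6"]  # assumed to have no links at all


def build_adjacency(edges, isolated=()):
    """Build an undirected adjacency list: node -> set of neighbours."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    for node in isolated:
        adjacency.setdefault(node, set())
    return adjacency


def connected_components(adjacency):
    """Group nodes into connected components using breadth-first search."""
    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = set()
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return components


if __name__ == "__main__":
    for entity in connected_components(build_adjacency(EDGES, ISOLATED)):
        print(sorted(entity))
```

With these made-up edges, transactions that share a phone or an email end up in the same component, which is what the slide calls an entity.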
Graph types
Schema definition
What is a graph database?
● A graph database uses graph structures to store and represent data
● NoSQL
● Supports ACID
● Edges (relationships) have a type and a direction and can have properties
● A connected component can usually be retrieved with one query
[Figure: Social Network Graph]
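To make the point about edges concrete, here is a minimal sketch of an edge that carries a type, a direction (source to target) and optional properties. The class, the names and the sample social-network data are all hypothetical; this is not how any particular graph database stores edges.

```python
from dataclasses import dataclass, field


@dataclass
class Edge:
    source: str   # node the edge starts from
    target: str   # node the edge points to
    type: str     # relationship type, e.g. "FOLLOWS"
    properties: dict = field(default_factory=dict)


# Hypothetical social-network edges, invented for illustration.
EDGES = [
    Edge("alice", "bob", "FOLLOWS", {"since": 2019}),
    Edge("bob", "carol", "FOLLOWS", {"since": 2021}),
    Edge("alice", "carol", "BLOCKS"),
]


def outgoing(edges, node, edge_type=None):
    """Return targets of edges leaving `node`, optionally filtered by type."""
    return [
        e.target
        for e in edges
        if e.source == node and (edge_type is None or e.type == edge_type)
    ]


if __name__ == "__main__":
    print(outgoing(EDGES, "alice", "FOLLOWS"))
    print(outgoing(EDGES, "bob"))
```

Because edges are directed, "alice FOLLOWS bob" does not imply the reverse: asking for the outgoing edges of carol returns nothing.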
Graph Applications
● Computer science (flow of computation)
● Map services (each street intersection is a node)
● Social networks (Facebook, Twitter)
● Fraud detection (eCommerce, financial fraud)
● Recommendation systems (in eCommerce)
● Search engines (Knowledge Graph)
● Neural networks
https://blog.finxter.com/graph-applications/
Dgraph Intro
● Written in Go
● Fast and scalable
● Distributed
● Open source
● Built-in visualization tool
● Supports high availability
● Flexible schema
● Query language based on Facebook's GraphQL
Dgraph Deployment
(on Linux)
1. Install docker and docker-compose
2. Save the compose snippet (see the link below) into docker-compose.yml
3. Run “docker-compose up -d”
4. Done!
https://docs.dgraph.io/get-started/
Data loading
Loading into Dgraph: Bulk Loader / Live Loader
[Diagram labels: RDF, TXN, Dgraph]
Data loading tips
Instead of finding all the existing links in the data before
loading (for example by repeatedly merging the table with
itself or by looping through the data), represent your
linking elements as separate nodes and rely on the
deduplication that Dgraph performs automatically
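The tip above can be sketched as follows. Each linking value (email, phone) becomes its own node, named by a blank-node label derived from the value, so two transactions with the same email point at the same label. This assumes the loader resolves identical blank-node labels within one load to a single node, which is presumably the deduplication the slide refers to. The field names, predicate names (`has_email`, `txn_id`, and so on) and sample rows are hypothetical.

```python
import hashlib

# Hypothetical transaction rows; field names are assumptions for illustration.
TRANSACTIONS = [
    {"id": "txn1", "email": "a@example.com", "phone": "+49 30 1234"},
    {"id": "txn2", "email": "a@example.com", "phone": "+49 30 9999"},
]

LINK_FIELDS = ("email", "phone")


def blank_node(kind, value):
    """Derive a stable blank-node label so that equal values share one label."""
    digest = hashlib.sha1(value.encode("utf-8")).hexdigest()[:12]
    return f"_:{kind}_{digest}"


def literal(value):
    """Quote a string as an RDF literal, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_rdf(transactions):
    """Emit one RDF line per fact; no pairwise link search is needed."""
    lines = []
    for txn in transactions:
        txn_id = txn["id"]
        txn_node = f"_:{txn_id}"
        lines.append(f"{txn_node} <txn_id> {literal(txn_id)} .")
        for field_name in LINK_FIELDS:
            value = txn.get(field_name)
            if not value:
                continue
            link_node = blank_node(field_name, value)
            lines.append(f"{txn_node} <has_{field_name}> {link_node} .")
            lines.append(f"{link_node} <{field_name}> {literal(value)} .")
    # Drop repeated lines (e.g. the same email value emitted twice), keeping order.
    return list(dict.fromkeys(lines))


if __name__ == "__main__":
    print("\n".join(to_rdf(TRANSACTIONS)))
```

The generator never compares transactions with each other; the link between txn1 and txn2 exists only because both refer to the same email label.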
Data loading tech metrics
Dataset               2.9 million txns             11.9 million txns
RDF row count/size    51.8M, 3.8 GB                225.9M, 15.8 GB
Loading time          7m45s                        40m47s
Max memory usage      RSS: 6.5 GB, VMS: 25.2 GB    RSS: 8.6 GB, VMS: 59.1 GB
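As a quick back-of-the-envelope check, the loading figures above can be turned into throughput. This assumes that the 51.8M-row RDF file belongs with the 7m45s run and the 225.9M-row file with the 40m47s run; the flattened slide does not state the pairing explicitly. The variable and function names are made up for the example.

```python
def to_seconds(minutes, seconds):
    """Convert a duration such as 7m45s into seconds."""
    return minutes * 60 + seconds


# Assumed pairing of row counts and loading times (see the note above).
RUNS = {
    "2.9M txns": {"rdf_rows": 51.8e6, "seconds": to_seconds(7, 45)},
    "11.9M txns": {"rdf_rows": 225.9e6, "seconds": to_seconds(40, 47)},
}


def rows_per_second(run):
    """Average number of RDF rows loaded per second."""
    return run["rdf_rows"] / run["seconds"]


if __name__ == "__main__":
    for name, run in RUNS.items():
        print(f"{name}: {rows_per_second(run):,.0f} RDF rows/s")
```

Under that pairing the smaller load averages roughly 111 thousand RDF rows per second and the larger one roughly 92 thousand, so throughput drops only moderately as the dataset grows about fourfold.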
Dgraph Query Language
How to query Dgraph:
● Graphical UI
● Client (Go, Python, Java …)
Example: retrieve the information about the movie “Blade Runner”
{
  bladerunner(func: eq(name, "Blade Runner")) {
    uid
    name
    initial_release_date
    netflix_id
  }
}
Output:
● JSON
● Graph visualization
Notes:
● Every level is enclosed in curly braces
● A query block always starts with a name (here, bladerunner)
● uid is the unique identifier of a node in Dgraph
● Graph-specific query features: recurse, shortest path
● The output of a query block can be saved in a variable
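A minimal sketch of handling the Blade Runner query from client code, without contacting a server. It builds the query text and reads the nodes out of a JSON response, assuming the HTTP-style response shape in which results sit under a `data` key, keyed by the name of the query block. The helper names and the sample response (including its uid) are invented for illustration.

```python
import json


def movie_query(block_name, title):
    """Build the query text; json.dumps quotes and escapes the title."""
    return (
        "{\n"
        f"  {block_name}(func: eq(name, {json.dumps(title)})) {{\n"
        "    uid\n"
        "    name\n"
        "    initial_release_date\n"
        "    netflix_id\n"
        "  }\n"
        "}"
    )


# Made-up response for illustration; real values come from the database.
SAMPLE_RESPONSE = """
{"data": {"bladerunner": [{"uid": "0x1", "name": "Blade Runner"}]}}
"""


def extract_nodes(response_text, block_name):
    """Return the list of nodes stored under the named query block."""
    payload = json.loads(response_text)
    return payload.get("data", {}).get(block_name, [])


if __name__ == "__main__":
    print(movie_query("bladerunner", "Blade Runner"))
    for node in extract_nodes(SAMPLE_RESPONSE, "bladerunner"):
        print(node["uid"], node["name"])
```

The name given to the query block is what the results are filed under in the JSON output, which is why the block needs a name in the first place.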
