Gluecon InfiniteGraph/DB

The following is an excerpt of a presentation
delivered at Gluecon 2010 in Broomfield,
Colorado.

The presentation is not about InfiniteGraph/DB
itself; it is an overview of managing
distributed graph data in a graph database.

Copyright © InfiniteGraph

Scaling the [Social] Graph
in the [Cloud]
Darren Wood
Lead Architect, InfiniteGraph

Graph Databases (Quickly)
• Optimized around data relationships
• Small focused API (typically not SQL)
• Typical Use Cases:
– Social Graph Analysis
– Catching Bad Guys (see Booth 16)
– Fraud / Financial (more bad guys)
– Data Intensive Science
– Web / Advertising Analytics


Graph Databases (Almost Done)
// Vertices: four people
Vertex alice = myGraph.addVertex(new Person("Alice"));
Vertex bob = myGraph.addVertex(new Person("Bob"));
Vertex carlos = myGraph.addVertex(new Person("Carlos"));
Vertex charlie = myGraph.addVertex(new Person("Charlie"));

// Edges: typed relationships (timestamp is assumed to be defined elsewhere)
alice.addEdge(new Meeting("Denver", "5-27-10"), bob);
bob.addEdge(new Call(timestamp), carlos);
carlos.addEdge(new Payment(100000.00), charlie);
bob.addEdge(new Call(timestamp), charlie);

[Diagram: Alice -Meets-> Bob -Calls-> Carlos -Pays-> Charlie, plus Bob -Calls-> Charlie]


What’s So Difficult Then?
• Graphs grow quickly
– Billions of phone calls / day in US
– Emails, social media events, IP Traffic
– Financial transactions
• Some analytics require navigation of large
sections of the graph
• Each step (often) depends on the last
• Must distribute data and go parallel
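The point that each step depends on the last can be illustrated with a level-by-level traversal. The following is a minimal in-memory sketch, not InfiniteGraph code: the class name `FrontierTraversal`, the adjacency-map representation, and the `sample()` graph (the four people from the earlier slide) are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of any graph database API.
class FrontierTraversal {

    // Returns the vertices reachable from 'start', grouped by hop count.
    // The frontier for hop k+1 cannot be computed until hop k is finished,
    // which is the sequential dependency mentioned on the slide.
    static List<Set<String>> levels(Map<String, List<String>> adjacency,
                                    String start, int maxHops) {
        List<Set<String>> result = new ArrayList<>();
        Set<String> visited = new HashSet<>();
        Set<String> frontier = new LinkedHashSet<>();
        frontier.add(start);
        visited.add(start);
        for (int hop = 0; hop <= maxHops && !frontier.isEmpty(); hop++) {
            result.add(frontier);
            Set<String> next = new LinkedHashSet<>();
            for (String v : frontier) {
                for (String n : adjacency.getOrDefault(v, List.of())) {
                    if (visited.add(n)) {   // true only the first time n is seen
                        next.add(n);
                    }
                }
            }
            frontier = next;
        }
        return result;
    }

    // The four people from the earlier slide, as a plain adjacency map.
    static Map<String, List<String>> sample() {
        return Map.of(
            "Alice", List.of("Bob"),
            "Bob", List.of("Carlos", "Charlie"),
            "Carlos", List.of("Charlie"));
    }
}
```

Within one hop the vertices of a frontier are independent of each other, so that inner loop is where work could be spread across processors; the hops themselves stay ordered.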

First Some Good News…
• Graph algorithms naturally branch
• Can be automated or guided

[Diagram: a graph branching out from Alice. Top row: Bob, Carlos, Charlie with edges Meets, Calls, Pays. Bottom row: Chuck, Dave, Eve with edges Calls, Lives With, Meets.]
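As a rough illustration of branching that can be "automated or guided", the sketch below explores each outgoing edge of a start vertex as an independent task and accepts an optional filter that steers the traversal. Everything here is hypothetical: the `BranchingTraversal` and `Edge` types are illustrative stand-ins, and the edge layout in `sample()` is one reading of the slide's diagram, not a confirmed one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical in-memory sketch, not a graph database API.
class BranchingTraversal {

    static final class Edge {
        final String type;
        final String target;
        Edge(String type, String target) { this.type = type; this.target = target; }
    }

    // Follows one branch depth-first and returns every maximal path that extends 'path'.
    static List<List<String>> explore(Map<String, List<Edge>> graph,
                                      List<String> path, Predicate<Edge> guide) {
        String last = path.get(path.size() - 1);
        List<Edge> out = graph.getOrDefault(last, List.of()).stream()
            .filter(guide)                              // "guided": caller prunes edges
            .filter(e -> !path.contains(e.target))      // avoid cycles
            .collect(Collectors.toList());
        if (out.isEmpty()) {
            return List.of(path);
        }
        List<List<String>> paths = new ArrayList<>();
        for (Edge e : out) {
            List<String> extended = new ArrayList<>(path);
            extended.add(e.target);
            paths.addAll(explore(graph, extended, guide));
        }
        return paths;
    }

    // Each first-hop branch is explored as a separate task on the common pool.
    static List<List<String>> exploreInParallel(Map<String, List<Edge>> graph,
                                                String start, Predicate<Edge> guide) {
        return graph.getOrDefault(start, List.of()).parallelStream()
            .filter(guide)
            .flatMap(e -> explore(graph, List.of(start, e.target), guide).stream())
            .collect(Collectors.toList());
    }

    // Assumed edge layout for the people named on the slide.
    static Map<String, List<Edge>> sample() {
        return Map.of(
            "Alice", List.of(new Edge("Meets", "Bob"), new Edge("Calls", "Chuck")),
            "Bob", List.of(new Edge("Calls", "Carlos")),
            "Carlos", List.of(new Edge("Pays", "Charlie")),
            "Chuck", List.of(new Edge("LivesWith", "Dave")),
            "Dave", List.of(new Edge("Meets", "Eve")));
    }
}
```

Passing `e -> true` as the guide gives the automated case, where every branch is followed. A guide such as `e -> !e.type.equals("Calls")` prunes branches before any work is spent on them.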


Big Distributed Data
(Traditional - Huge Generalization)

Application(s)

Distributed API

Processor Processor Processor Processor

Partition 1 Partition 2 Partition 3 Partition ...n


Big Distributed Data
(Graph)

Application(s)

Distributed API

Processor Processor Processor Processor

Partition 1 Partition 2 Partition 3 Partition ...n


So What Are The Answers?
Best Effort Partitioning

Distributed API

Processor Processor

Partition 1 Partition 2
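The slide names best effort partitioning without describing a mechanism, so the following is only one plausible interpretation: a greedy placement that puts each new vertex in the partition already holding most of its neighbours, and falls back to another partition when that one is full. The class `BestEffortPartitioner`, the fixed per-partition `capacity`, and the tie-breaking rule are all assumptions.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

// Hypothetical greedy placement sketch; not taken from the presentation.
class BestEffortPartitioner {
    private final int partitionCount;
    private final int capacity;                 // assumed max vertices per partition
    private final int[] load;
    private final Map<String, Integer> placement = new HashMap<>();

    BestEffortPartitioner(int partitionCount, int capacity) {
        this.partitionCount = partitionCount;
        this.capacity = capacity;
        this.load = new int[partitionCount];
    }

    // Places 'vertex' and returns its partition index.
    int place(String vertex, Collection<String> neighbors) {
        Integer existing = placement.get(vertex);
        if (existing != null) {
            return existing;
        }
        // Count how many already-placed neighbours live in each partition.
        int[] affinity = new int[partitionCount];
        for (String n : neighbors) {
            Integer p = placement.get(n);
            if (p != null) {
                affinity[p]++;
            }
        }
        // Pick the non-full partition with the highest affinity; break ties by lower load.
        int best = -1;
        for (int p = 0; p < partitionCount; p++) {
            if (load[p] >= capacity) {
                continue;
            }
            if (best < 0
                    || affinity[p] > affinity[best]
                    || (affinity[p] == affinity[best] && load[p] < load[best])) {
                best = p;
            }
        }
        if (best < 0) {
            throw new IllegalStateException("all partitions are full");
        }
        placement.put(vertex, best);
        load[best]++;
        return best;
    }
}
```

It is "best effort" in the sense that a vertex whose preferred partition is full still gets placed, at the price of edges that cross partitions.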


So What Are The Answers?
The Look Ahead Example

Application

Distributed API

Processor Processor

[Diagram: vertices A, B, C, D, E, X and Y distributed across the two partitions]

Partition 1 Partition 2


Which of These Work?
• A carefully orchestrated combination of
the various options
• Look ahead can be tuned (degree of look ahead)
• Healing the graph can be expensive (write cost)
• Healing can also be tuned/configured (external
edge thresholds)
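One way to picture an external edge threshold is as a per-vertex check: if the share of a vertex's edges that leave its partition exceeds a configured limit, the vertex becomes a candidate for relocation (the "healing" write). The sketch below is an assumption about how such a check could look, not the presenter's implementation; `ExternalEdgeCheck`, `targetPartition`, and the ratio-based threshold are hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a threshold-based healing decision.
class ExternalEdgeCheck {

    // Returns the partition 'vertex' should live in: its current partition if the
    // fraction of external edges is within 'threshold', otherwise the partition
    // holding most of its neighbours. 'vertex' must already be in 'placement'.
    static int targetPartition(String vertex,
                               Map<String, Integer> placement,
                               Map<String, List<String>> neighbors,
                               double threshold) {
        int home = placement.get(vertex);
        Map<Integer, Integer> counts = new HashMap<>();
        int total = 0;
        for (String n : neighbors.getOrDefault(vertex, List.of())) {
            Integer p = placement.get(n);
            if (p == null) {
                continue;                       // neighbour not placed yet
            }
            counts.merge(p, 1, Integer::sum);
            total++;
        }
        if (total == 0) {
            return home;
        }
        int external = total - counts.getOrDefault(home, 0);
        if ((double) external / total <= threshold) {
            return home;                        // below threshold: avoid the write cost
        }
        int best = home;
        for (Map.Entry<Integer, Integer> e : counts.entrySet()) {
            if (e.getValue() > counts.getOrDefault(best, 0)) {
                best = e.getKey();
            }
        }
        return best;
    }
}
```

A higher threshold means fewer relocations and therefore less write cost, but more cross-partition hops at read time; a lower threshold trades the other way.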


Thank you!
darren.wood@infinitegraph.com

twitter.com/infinitegraph

