TinkerPop 2020
Joshua Shinavier, Ph.D.
Global Graph Summit
Austin, Texas - January 25th, 2020
○ A brief history of Gremlin
○ Open problems
○ What’s next
A brief history of Gremlin
Long, long ago...
TinkerPop 0.x
○ “Making stuff for the fun of it...”
○ From RDF to the property graph data model
○ A Turing complete path language for graphs
○ Ripple!
○ Oh, and Gremlin
○ Blueprints
○ “JDBC for graphs”
○ RDF ←→ PG support added early on
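As a rough illustration of the property graph data model mentioned above, here is a minimal in-memory sketch in Python. The names `Vertex`, `Edge`, `PropertyGraph`, `add_vertex`, `add_edge`, and `out` are assumptions made for this example; this is not the Blueprints API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Vertex:
    id: int
    label: str
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Edge:
    id: int
    label: str
    out_v: int  # id of the source vertex
    in_v: int   # id of the target vertex
    properties: Dict[str, Any] = field(default_factory=dict)


class PropertyGraph:
    """Hypothetical in-memory graph: labeled vertices and edges with key/value properties."""

    def __init__(self) -> None:
        self.vertices: Dict[int, Vertex] = {}
        self.edges: Dict[int, Edge] = {}

    def add_vertex(self, label: str, **props: Any) -> Vertex:
        v = Vertex(len(self.vertices) + 1, label, dict(props))
        self.vertices[v.id] = v
        return v

    def add_edge(self, label: str, out_v: Vertex, in_v: Vertex, **props: Any) -> Edge:
        e = Edge(len(self.edges) + 1, label, out_v.id, in_v.id, dict(props))
        self.edges[e.id] = e
        return e

    def out(self, v: Vertex, label: str) -> List[Vertex]:
        """Vertices reachable from v over outgoing edges with the given label."""
        return [self.vertices[e.in_v] for e in self.edges.values()
                if e.out_v == v.id and e.label == label]


g = PropertyGraph()
marko = g.add_vertex("person", name="marko")
josh = g.add_vertex("person", name="josh")
g.add_edge("knows", marko, josh, weight=1.0)
names = [v.properties["name"] for v in g.out(marko, "knows")]
```

Both vertices and edges carry a label and arbitrary properties, which is the main difference from a plain RDF triple.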
TinkerPop 0.x
○ Rexster
○ Server for Blueprints-enabled graphs
○ Predecessor of Gremlin Server
○ Pipes
○ Pull-based dataflow framework
○ Frames
○ Object-oriented graph interfaces using Java annotations
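The "pull-based dataflow" idea behind Pipes can be sketched with Python generators: each stage asks its upstream stage for the next item only when it is itself asked for one. The function names below (`filter_pipe`, `map_pipe`, `numbers`) are hypothetical and only illustrate the pattern; they are not the Pipes API.

```python
from typing import Callable, Iterable, Iterator, List, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def filter_pipe(pred: Callable[[A], bool], source: Iterable[A]) -> Iterator[A]:
    """Pass through only the items that satisfy pred."""
    for item in source:
        if pred(item):
            yield item


def map_pipe(fn: Callable[[A], B], source: Iterable[A]) -> Iterator[B]:
    """Transform each item pulled from the upstream source."""
    for item in source:
        yield fn(item)


pulled: List[int] = []  # records what the source was actually asked to produce


def numbers() -> Iterator[int]:
    n = 0
    while True:  # unbounded source; nothing is computed until a consumer pulls
        pulled.append(n)
        yield n
        n += 1


pipeline = map_pipe(lambda x: x * x, filter_pipe(lambda x: x % 2 == 0, numbers()))
first_three = [next(pipeline) for _ in range(3)]
```

Because the consumer drives evaluation, the unbounded source is read only as far as needed to produce three results.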
TinkerPop 1.x
○ Supported graph back-ends as of May 2012:
○ TinkerGraph (in-memory), Neo4j, OrientDB, DEX (Sparksee)
○ Blueprints adapters for
○ Sesame RDF (RDF4J), JUNG
TinkerPop 2.x
○ Furnace
○ Algorithms package built for property graphs
○ Predecessor of graph OLAP in TinkerPop3
○ New language ecosystems
○ Expansion of functionality on top of Blueprints
TinkerPop 3.x
○ Complete rewrite of TinkerPop
○ Focus on scale and performance
○ Symmetry between OLTP and OLAP
○ Gremlin becomes more central
○ Git mono-repo
○ Interfaces not only with graph DBs, but also with graph processors
Apache TinkerPop
○ Not-only-JVM
○ Gremlin in native programming languages
○ Now dozens of graph systems implementing TinkerPop
○ Third-party managed libraries and tools
Graph systems
○ Alibaba Graph Database
○ Amazon Neptune
○ ArangoDB
○ Bitsy
○ Blazegraph
○ CosmosDB
○ ChronoGraph
○ DSEGraph
○ GRAKN.AI
○ Hadoop (Spark)
○ HGraphDB
○ Huawei Graph Engine Service
○ IBM Graph
○ JanusGraph
○ Neo4j
○ neo4j-gremlin-bolt
○ OrientDB
○ Apache S2Graph
○ Sqlg
○ Stardog
○ TinkerGraph
○ Titan
○ Titan + Tupl
○ Unipop
Query languages, drivers, and GLVs
○ Clojure: ogre
○ Cypher: cypher-for-gremlin
○ Elixir: gremlex
○ Go: grammes, gremgo
○ Haskell: greskell, gremlin-haskell
○ Java: Ferma, gremlin-objects, Peapod, spring-data-gremlin, gremlin-driver
○ JavaScript: gremlin-javascript, gremlin-orm, gremlin-template-string
○ Kotlin: kotlin-gremlin-ogm
○ .NET: Gremlin.Net, Gremlinq
○ PHP: gremlin-php
○ Python: Goblin, gremlin-python, gremlin-py, ipython-gremlin, gremlinclient, JUGRI, gremlinrestclient, python-gremlin-rest
○ Ruby: gremlin_client
○ Rust: gremlin-rs
○ Scala: gremlin-scala, reactive-gremlin, scalajs-gremlin-client
○ SPARQL: sparql-gremlin
○ SQL: sql-gremlin
○ TypeScript: ts-tinkerpop
Open problems
Escape from the JVM
○ TinkerPop originally 100% Java + Groovy
○ Still very JVM-heavy
○ Gremlin-Server is Java-only
○ How to achieve parity across languages?
○ Ideally: complete Gremlin VM in every language ecosystem
○ Code generation?
○ How to generate both:
○ Clean APIs
○ Efficient runtime code
○ ...that fit together?
Making life easier for graph providers
○ Creating TinkerPop implementations
○ Currently a monolithic effort for each language / environment
○ How do we:
○ Ensure consistency across implementations?
○ Reduce the workload?
○ Thoughtful test suite
○ Rigorous in terms of correct operations
○ Does not force functionality that may not fit
○ Types and constraints may help
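One way to read "does not force functionality that may not fit" is a test suite in which every check declares the features it needs, and checks are skipped for providers that do not claim those features. The sketch below assumes that reading; the feature names, the `SUITE` layout, and `run_suite` are all hypothetical and are not part of the TinkerPop test suite.

```python
from typing import Callable, Dict, List, Set, Tuple

# (check name, required features, check function); all names are illustrative.
Check = Tuple[str, Set[str], Callable[[dict], bool]]


def check_add_vertex(g: dict) -> bool:
    g.setdefault("vertices", []).append({"id": 1})
    return len(g["vertices"]) == 1


def check_rollback(g: dict) -> bool:
    return callable(g.get("rollback"))


SUITE: List[Check] = [
    ("add_vertex", {"add_vertices"}, check_add_vertex),
    ("rollback", {"transactions"}, check_rollback),
]


def run_suite(make_graph: Callable[[], dict], features: Set[str]) -> Dict[str, str]:
    """Run only the checks whose required features the provider declares."""
    results: Dict[str, str] = {}
    for name, required, check in SUITE:
        if not required <= features:
            results[name] = "skipped"
        else:
            results[name] = "passed" if check(make_graph()) else "failed"
    return results


# A toy provider (a plain dict) that supports adding vertices but not transactions.
report = run_suite(dict, {"add_vertices"})
```

A provider is then held to rigorous checks for what it claims to support, and nothing more.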
Network serialization formats
○ GraphML (XML)
○ Widely supported
○ Graphs only
○ GraphSON (JSON)
○ TinkerPop-specific
○ Graphs, elements, paths, etc.
○ {1.0, 2.0, 3.0}
○ GraphBinary
○ TinkerPop-specific
○ Graphs, elements, paths, etc.
○ Good forward-compatibility
○ Gryo (Kryo)
○ JVM only
○ Bit of a format zoo
○ One format to rule them all?
○ Mappings between formats?
○ Will schemas help?
○ How about common RPC formats?
○ Thrift, Protobuf, Avro, etc.
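To make the JSON option concrete, the sketch below wraps values in `@type`/`@value` objects, loosely following the convention used by typed GraphSON. It covers only a handful of scalar types and lists, the function names `to_typed` and `from_typed` are assumptions for this example, and it makes no claim of conformance with any GraphSON version.

```python
import json
from typing import Any

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def to_typed(value: Any) -> Any:
    """Wrap numbers and lists in type-tagged objects; leave strings, booleans, None as-is."""
    if value is None or isinstance(value, (bool, str)):
        return value
    if isinstance(value, int):
        tag = "g:Int32" if INT32_MIN <= value <= INT32_MAX else "g:Int64"
        return {"@type": tag, "@value": value}
    if isinstance(value, float):
        return {"@type": "g:Double", "@value": value}
    if isinstance(value, list):
        return {"@type": "g:List", "@value": [to_typed(v) for v in value]}
    raise TypeError(f"unsupported type: {type(value).__name__}")


def from_typed(node: Any) -> Any:
    """Inverse of to_typed for the subset of tags handled above."""
    if isinstance(node, dict) and "@type" in node:
        tag, inner = node["@type"], node["@value"]
        if tag == "g:List":
            return [from_typed(v) for v in inner]
        if tag in ("g:Int32", "g:Int64"):
            return int(inner)
        if tag == "g:Double":
            return float(inner)
        raise ValueError(f"unknown type tag: {tag}")
    return node


wire = json.dumps(to_typed([1, 2.5, "josh"]))
decoded = from_typed(json.loads(wire))
```

The explicit tags are what let a non-JVM client distinguish, say, a 32-bit integer from a double, which plain JSON cannot express.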
Schemas in TinkerPop
○ Property graphs:
○ Strong on intuitiveness
○ Historically weak on schema
○ Lightweight property graph schemas
○ E.g. in JanusGraph, Neo4j, basic Graph.Features
○ Stronger graph schemas
○ RDF triple stores, hypergraph databases, object databases, etc.
○ Schemas facilitate composability of data and queries
○ ...enabling optimizations, mappings, migration, other good stuff
○ What’s the best fit for TinkerPop?
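For a sense of what a "lightweight" property graph schema might look like, here is a minimal sketch that maps vertex labels to required property keys and types, and validates a vertex against it. The `SCHEMA` layout and `validate_vertex` are assumptions for illustration and do not reflect how JanusGraph, Neo4j, or Graph.Features express constraints.

```python
from typing import Any, Dict, List

# Hypothetical schema: vertex label -> required property keys and their Python types.
SCHEMA: Dict[str, Dict[str, type]] = {
    "person": {"name": str, "age": int},
    "software": {"name": str},
}


def validate_vertex(label: str, properties: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable violations; an empty list means valid."""
    if label not in SCHEMA:
        return [f"unknown label: {label}"]
    errors: List[str] = []
    for key, expected in SCHEMA[label].items():
        if key not in properties:
            errors.append(f"missing property: {key}")
        elif not isinstance(properties[key], expected):
            errors.append(f"wrong type for {key}: expected {expected.__name__}")
    return errors


ok = validate_vertex("person", {"name": "marko", "age": 29})
bad = validate_vertex("person", {"name": "marko", "age": "29"})
```

Even a schema this small gives tools something to check data and queries against before they run.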
Getting transactions right
○ How to support diverse transactional models?
○ Neo4j is different than JanusGraph is different than...
○ Is there a unified approach to:
○ Threads + queries + transactions?
○ Transactional scope?
○ Transaction failures?
○ Nested transactions?
○ etc.
○ Will functional approaches to concurrency help?
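As one possible framing of "transactional scope" and "transaction failures", the sketch below ties a transaction to a lexical block: changes made inside the block are kept if it completes and discarded if it raises. The `Store` class and `transaction` helper are hypothetical, use a naive snapshot copy, and do not model how Neo4j, JanusGraph, or TinkerPop handle transactions.

```python
import copy
from contextlib import contextmanager
from typing import Any, Dict, Iterator


class Store:
    """Toy element store keyed by element id."""

    def __init__(self) -> None:
        self.data: Dict[str, Dict[str, Any]] = {}


@contextmanager
def transaction(store: Store) -> Iterator[Dict[str, Dict[str, Any]]]:
    snapshot = copy.deepcopy(store.data)  # naive: copy everything up front
    try:
        yield store.data                   # the block mutates the live data
    except Exception:
        store.data = snapshot              # failure: restore the snapshot
        raise


store = Store()

with transaction(store) as tx:
    tx["v1"] = {"name": "marko"}           # block completes, change is kept

try:
    with transaction(store) as tx:
        tx["v2"] = {"name": "josh"}
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass                                    # change to v2 has been rolled back
```

This single-threaded sketch sidesteps the hard questions on the slide (threads, nesting, differing back-end semantics); it only shows how scope and failure handling can be made explicit in the API.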
Static analysis for traversals
○ Stop supporting opaque traversals
○ Security issues
○ Portability issues
○ Need a replacement for closures/lambdas
○ “Just write Gremlin”
○ What additional features are required?
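The contrast between opaque and analyzable traversals can be sketched by treating a traversal as plain data: a list of step names with arguments. A checker can then reject unknown steps and any argument that is executable code. The step names echo common Gremlin steps, but the tuple encoding, `ALLOWED_STEPS`, and `analyze` are assumptions for this example and are not Gremlin bytecode.

```python
from typing import Any, List, Tuple

Step = Tuple[Any, ...]  # (step name, *arguments)

# Illustrative whitelist of declarative steps.
ALLOWED_STEPS = {"V", "has", "out", "values"}


def analyze(traversal: List[Step]) -> List[str]:
    """Statically inspect a traversal without executing it."""
    problems: List[str] = []
    for i, step in enumerate(traversal):
        name, *args = step
        if name not in ALLOWED_STEPS:
            problems.append(f"step {i}: unsupported step '{name}'")
        for arg in args:
            if callable(arg):  # a closure cannot be inspected or ported
                problems.append(f"step {i}: opaque lambda argument")
    return problems


declarative = [("V",), ("has", "name", "marko"), ("out", "knows"), ("values", "name")]
opaque = [("V",), ("filter", lambda v: v["name"] == "marko")]

declarative_report = analyze(declarative)
opaque_report = analyze(opaque)
```

The declarative form can be serialized, audited, and sent to any language runtime; the lambda can only be run, which is the source of the security and portability concerns.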
Graph stream processing
○ Much of the world’s data is streaming
○ Much of that data describes entities and relationships
○ Decades of research on relational stream processing
○ 10+ years on continuous SPARQL
○ What is continuous Gremlin?
○ (RDF)-[:betterThan]->(PG) for streaming
○ RDF stream := unbounded sequence of triples
○ Property graph stream := ?
○ Need schemas, global identifiers, set operations on graphs
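The definition "RDF stream := unbounded sequence of triples" translates almost directly into code. The sketch below models a stream as an endless generator of triples and evaluates over a count-based sliding window; the `ex:` identifiers, `triple_stream`, and `sliding_windows` are invented for illustration and do not follow any particular continuous SPARQL engine.

```python
from collections import deque
from itertools import islice
from typing import Deque, Iterable, Iterator, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def triple_stream() -> Iterator[Triple]:
    """An unbounded sequence of triples with made-up identifiers."""
    n = 0
    while True:
        yield (f"ex:s{n}", "ex:p", f"ex:o{n}")
        n += 1


def sliding_windows(stream: Iterable[Triple], size: int) -> Iterator[List[Triple]]:
    """Yield the most recent `size` triples each time a new triple arrives."""
    window: Deque[Triple] = deque(maxlen=size)
    for triple in stream:
        window.append(triple)
        if len(window) == size:
            yield list(window)


# Take the first two windows of size 2 from the infinite stream.
windows = list(islice(sliding_windows(triple_stream(), 2), 2))
```

Each window is just a small set of triples, so ordinary graph operations apply to it. The open question on the slide is what the equivalent unit is for a property graph, where an element is not a single self-contained statement.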
The 10¹⁰ foot view
Abstractions
○ Data models
○ Query languages
○ Formal inference
○ Transformations
○ Embeddings
○ Graph +
○ Relational model
○ Streams
○ ...
Human and machine knowledge
○ Knowledge graphs
○ Enterprise
○ Personal
○ Collaborative
○ Mental representations
○ Representation learning
○ Visualization and HCI
○ ...
Processing and performance
○ Graph...
○ Ingestion
○ Generation
○ Partitioning
○ Compression
○ Concurrent systems
○ Parallel
○ Distributed
○ Graph analytics
○ Hardware acceleration
○ Benchmarks and metrics
○ ...
What’s next
From Graph.Features to a real type system
○ No existing standard for property graphs
○ Recent community efforts
○ W3C Workshop on Web Standardization for Graph Data (March 2019)
○ Property Graph Schema Working Group (PGSWG)
○ Graph Query Language (GQL)
○ Don’t forget about external data models
○ Relational model
○ RDF and other graph models
○ Data interchange formats (Protocol Buffers, Thrift, Avro, etc.)
○ OO, ER, and semistructured data models
Taming the dragon (connecting 3+ data models)
Algebraic Property Graphs
○ Last year at Data Day...
○ A Graph is a Graph is a Graph
○ Composable and bidirectional mappings
○ Formal property graph data model
○ Taxonomy of graph elements
○ Use category theory for the model
○ Developed with Ryan Wisnesky (Conexus AI)
○ Implementations in Haskell and CQL
○ Minimal cover for enterprise data
○ Analogous features in graph and non-graph data models
Elements, labels, values, and types
Graph transformations
Building structure APIs
○ Vertices, edges, and properties
○ Special cases that can be derived from the type system
○ Graphs are different
○ Not described in terms of types
○ Graph API is often redundant in TinkerPop3
○ Structure APIs currently written by hand
○ In each language, for each Gremlin Language Variant
○ We can generate consistent interfaces across GLVs
○ Some tooling already exists
○ Build new tools if we want to make it easier
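Generating structure APIs from a schema could look roughly like the following: a small generator emits class source code from a declarative description, and the same description could feed generators for other languages. The `SCHEMA` layout and `generate_class` are hypothetical, and every label is assumed to have at least one property key; no existing TinkerPop tooling is implied.

```python
from typing import Dict, List

# Hypothetical schema: element type name -> ordered property keys (non-empty).
SCHEMA: Dict[str, List[str]] = {
    "Person": ["name", "age"],
    "Software": ["name", "lang"],
}


def generate_class(label: str, keys: List[str]) -> str:
    """Emit Python source for a class with one attribute per property key."""
    params = ", ".join(keys)
    lines = [f"class {label}:", f"    def __init__(self, {params}):"]
    lines += [f"        self.{key} = {key}" for key in keys]
    return "\n".join(lines) + "\n"


source = "\n".join(generate_class(label, keys) for label, keys in SCHEMA.items())

# Load the generated code into a fresh namespace and use it.
namespace: dict = {}
exec(source, namespace)
marko = namespace["Person"]("marko", 29)
```

Swapping `generate_class` for a template that emits another language would yield matching interfaces there, which is the consistency across GLVs that the slide is after.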
Building process APIs
○ Need abstractions for graph processing
○ Steps, constraints, traversals
○ Freebie: every traversal has a graph representation
○ Graph programs as graph data
○ Generate process APIs for each GLV
○ Using a schema; analogous to generating structure APIs
○ Possible to also generate process implementations?
○ That would be great, but... TBD
○ Code gen options: Haskell? Idris? LLVM? Custom code...
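The "freebie" above, that every traversal has a graph representation, can be illustrated by turning a list of steps into step vertices linked by `next` edges. The tuple encoding of steps and the dictionary layout of vertices and edges are assumptions made for this sketch.

```python
from typing import Any, Dict, List, Tuple

Step = Tuple[Any, ...]  # (step name, *arguments)


def traversal_to_graph(steps: List[Step]) -> Tuple[List[Dict[str, Any]], List[Dict[str, Any]]]:
    """Represent a linear traversal as step vertices joined by 'next' edges."""
    vertices = [
        {"id": i, "label": "step", "name": name, "args": list(args)}
        for i, (name, *args) in enumerate(steps)
    ]
    edges = [
        {"label": "next", "out": i, "in": i + 1}
        for i in range(len(steps) - 1)
    ]
    return vertices, edges


traversal = [("V",), ("out", "knows"), ("values", "name")]
vertices, edges = traversal_to_graph(traversal)
```

Once a program is stored this way, it can be queried, validated, or rewritten with the same tools used for any other graph data.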
Abstractions for graph processing
○ Gremlin traversals are “like” monadic composition
○ Let’s make them properly monadic
○ Pure functional encapsulation of:
○ Side-effects, transactions, exception handling
○ Learn from existing functional approaches to Gremlin
○ Gremlin-Scala, Greskell, Gremlin-Haskell
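To show what "monadic composition" means for traversals, the sketch below uses the list monad: a traverser set is a list, `unit` wraps a start vertex, and `bind` applies a step to every element and flattens the results. The toy adjacency map and the names `unit`, `bind`, and `out` are assumptions for this example and are not taken from Gremlin-Scala, Greskell, or Gremlin-Haskell.

```python
from typing import Callable, Dict, List

# Toy adjacency map: vertex name -> names of adjacent vertices.
GRAPH: Dict[str, List[str]] = {
    "marko": ["vadas", "josh"],
    "josh": ["ripple", "lop"],
    "vadas": [],
}


def unit(x: str) -> List[str]:
    """Wrap a single value as a traverser list."""
    return [x]


def bind(m: List[str], f: Callable[[str], List[str]]) -> List[str]:
    """Apply a step to each traverser and flatten the results."""
    return [y for x in m for y in f(x)]


def out(v: str) -> List[str]:
    """One-hop step over the toy graph."""
    return GRAPH.get(v, [])


# Two hops from "marko", written as a chain of binds.
two_hops = bind(bind(unit("marko"), out), out)
```

The monad laws (for instance `bind(unit(x), f) == f(x)`) are what make such chains safe to refactor and optimize. Richer monads could carry side-effects, transactions, or errors alongside the traversers.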
Mainstream languages for serialization
Transforming graph data and operations
○ Need a language for schema mappings
○ In theory, that gives us:
○ Automated query rewriting
○ Automated data migration
○ Mix-and-match operations
○ Easy, right...?
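A toy version of a schema mapping makes the promised payoffs easier to see: the same pair of renaming tables drives both data migration and query rewriting, and inverting the tables gives the reverse direction. The mapping tables, the step encoding, and the functions `migrate`, `rewrite`, and `invert` are all hypothetical, and real mappings would need far more than renaming.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical mapping from an old schema to a new one.
LABEL_MAP = {"person": "Person"}
KEY_MAP = {"name": "fullName"}


def invert(mapping: Dict[str, str]) -> Dict[str, str]:
    """Reverse a one-to-one renaming table."""
    return {new: old for old, new in mapping.items()}


def migrate(element: Dict[str, Any], labels: Dict[str, str], keys: Dict[str, str]) -> Dict[str, Any]:
    """Rename an element's label and property keys according to the mapping."""
    return {
        "label": labels.get(element["label"], element["label"]),
        "properties": {keys.get(k, k): v for k, v in element["properties"].items()},
    }


def rewrite(steps: List[Tuple[str, str]], labels: Dict[str, str], keys: Dict[str, str]) -> List[Tuple[str, str]]:
    """Rewrite label and key arguments of a query using the same mapping."""
    rewritten = []
    for name, arg in steps:
        if name == "hasLabel":
            arg = labels.get(arg, arg)
        elif name == "values":
            arg = keys.get(arg, arg)
        rewritten.append((name, arg))
    return rewritten


old = {"label": "person", "properties": {"name": "marko"}}
new = migrate(old, LABEL_MAP, KEY_MAP)
round_trip = migrate(new, invert(LABEL_MAP), invert(KEY_MAP))
new_query = rewrite([("hasLabel", "person"), ("values", "name")], LABEL_MAP, KEY_MAP)
```

Because data and queries are transformed by one specification, a query written against the old schema keeps its meaning against the migrated data.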
Making a smooth transition
○ (TP3 → TP4) ≠ (TP2 → TP3)
○ Large user base, good support for TinkerPop3
○ Q: how do we:
○ Make new features useful to the current community
○ Make the migration to TinkerPop4 as seamless as possible
○ A: we try stuff out
○ “The revolution will be A/B tested”
○ Get involved!
○ gremlin-users@googlegroups.com
○ dev@tinkerpop.apache.org
Thanks!
Joshua Shinavier
joshsh@uber.com
Stephen Mallette, Marko Rodriguez, Ketrina Yim, and the graph community
