Property graphs with time
Julia Stoyanovich, joint work with Vera Moffitt
Drexel University
Philadelphia, PA USA
stoyanovich.org
openCypher Meetup, October 25, 2017
[Figure: yearly snapshots labeled 2007, 2008, 2009, 2010, 2011]
https://www.kenedict.com/apples-internal-innovation-network-unraveled-part-1-evolving-networks/
https://arxiv.org/abs/1709.06176
Exploratory analysis of evolving graphs
• Which nodes are showing an increasing popularity trend?
• Have any changes in network connectivity been
observed?
• At what time scale can interesting trends be observed?
• How can multiple data sources be used jointly to
complement or corroborate information about network
evolution?
Goal
Principled and systematic support for usable,
scalable and extensible analysis of evolving graphs
Are Alice and Bill connected?
… by a path?
Snapshot reducibility
Are Alice and Bill connected?
extended snapshot reducibility
… by a journey?
… by a path that persists over >2 time instants
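The difference between the two questions can be sketched in a few lines. This is a toy encoding, not Portal's: each edge is assumed to carry a half-open interval [start, end) of integer time instants, the edge data is made up, and the names `connected_at` and `persistent_connection` are hypothetical. The first function answers the question independently on each snapshot (snapshot reducibility); the second asks whether one and the same path survives over a window of consecutive instants, which is assumed here to be what "persists" means.

```python
from collections import defaultdict, deque

# Made-up temporal edges: (u, v, start, end), alive at integer instants start <= t < end.
EDGES = [
    ("Alice", "Cathy", 1, 4),
    ("Cathy", "Bill", 3, 7),
    ("Alice", "Dan", 5, 9),
    ("Dan", "Bill", 5, 9),
]

def reachable(edges, src, dst):
    """Breadth-first search over an undirected list of (u, v) pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, queue = {src}, deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for nxt in adj[node] - seen:
            seen.add(nxt)
            queue.append(nxt)
    return False

def connected_at(t, src, dst, edges=EDGES):
    """Snapshot-reducible query: plain reachability on the snapshot at instant t."""
    snapshot = [(u, v) for u, v, s, e in edges if s <= t < e]
    return reachable(snapshot, src, dst)

def persistent_connection(src, dst, min_instants, horizon, edges=EDGES):
    """Start instants t0 such that a single path exists throughout [t0, t0 + min_instants)."""
    starts = []
    for t0 in range(horizon - min_instants + 1):
        t1 = t0 + min_instants
        # Only edges alive during the whole window can belong to a persisting path.
        alive = [(u, v) for u, v, s, e in edges if s <= t0 and t1 <= e]
        if reachable(alive, src, dst):
            starts.append(t0)
    return starts

per_snapshot = {t: connected_at(t, "Alice", "Bill") for t in range(10)}
windows = persistent_connection("Alice", "Bill", min_instants=3, horizon=10)
```

In this toy data Alice and Bill are connected at instant 3 through Cathy, but that path lasts a single instant, so it does not count as persisting; only the path through Dan does.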
TGraph: an evolving property graph
TGA: Temporal Graph Algebra
• Temporal variants of standard graph operators + novel time-specific operators
• Compositional: TGraph (or a pair of TGraphs) as input, TGraph as output
• Operations maintain model integrity
- graph integrity at each time instant: no dangling edges, a node/edge appears at most once
- temporal integrity: semantics of temporal operations are automatically enforced (formally: point semantics)
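The "no dangling edges" condition can be illustrated with a small check. This is a sketch under simplifying assumptions, not Portal code: every node is assumed to have a single half-open validity interval [start, end) over integer instants, and `dangling_edges` is a hypothetical helper name.

```python
def dangling_edges(nodes, edges):
    """Return edges that exist at some instant when one of their endpoints does not.

    nodes: dict node_id -> (start, end)
    edges: list of (u, v, start, end)
    """
    bad = []
    for u, v, s, e in edges:
        for endpoint in (u, v):
            span = nodes.get(endpoint)
            # An edge is valid only if its interval lies inside both endpoint intervals.
            if span is None or not (span[0] <= s and e <= span[1]):
                bad.append((u, v, s, e))
                break
    return bad

nodes = {"Alice": (1, 8), "Bill": (2, 6)}
edges = [
    ("Alice", "Bill", 2, 5),   # fine: inside both node intervals
    ("Alice", "Bill", 5, 7),   # dangling at instant 6, when Bill no longer exists
    ("Alice", "Cathy", 3, 4),  # dangling: Cathy is not a node at all
]
violations = dangling_edges(nodes, edges)
```

An operator that preserves model integrity would never produce a result for which a check like this returns a non-empty list.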
TGA operations
• trim
• temporal versions of
- vertex-map, edge-map
- subgraph, path
- aggregate messages
- union, intersection, difference - binary
• snapshot analytics
- PageRank, connected components,… - Pregel
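As an illustration of how such an operator composes, here is a rough sketch of trim. The slide does not define it; the sketch assumes trim restricts a temporal graph to a time window, uses a toy encoding with half-open integer intervals, and the function name and data layout are hypothetical rather than Portal's API.

```python
def trim(nodes, edges, lo, hi):
    """Clip every validity interval to [lo, hi) and drop elements left with no instants.

    nodes: dict node_id -> (start, end); edges: list of (u, v, start, end).
    Returns a (nodes, edges) pair of the same shape, so the result can be fed
    to another operator.
    """
    def clip(s, e):
        s2, e2 = max(s, lo), min(e, hi)
        return (s2, e2) if s2 < e2 else None

    kept_nodes = {}
    for n, (s, e) in nodes.items():
        c = clip(s, e)
        if c is not None:
            kept_nodes[n] = c

    kept_edges = []
    for u, v, s, e in edges:
        c = clip(s, e)
        if c is not None:
            kept_edges.append((u, v, *c))
    return kept_nodes, kept_edges

nodes = {"Alice": (1, 8), "Bill": (2, 6), "Cathy": (7, 9)}
edges = [("Alice", "Bill", 2, 6)]
trimmed_nodes, trimmed_edges = trim(nodes, edges, 3, 5)
```

Because an edge interval that lies inside its endpoints' intervals still does so after all three are clipped to the same window, this sketch cannot introduce dangling edges.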
TGA operations
• node creation
- based on temporal window: temporal zoom
- attribute-based: structural zoom
• edge creation
Structural zoom
add university nodes Drexel and CMU,
and edges between students and these universities
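A minimal sketch of this example, assuming each student node carries a `school` attribute and a single half-open validity interval. The student records, the function name `add_attribute_nodes`, and the choice to give each university node the covering span of its students' intervals are all illustrative assumptions, not Portal's implementation.

```python
def add_attribute_nodes(students, attr="school"):
    """Create one node per distinct attribute value, plus student-to-value edges.

    students: dict name -> {"school": str, "interval": (start, end)}
    Returns (new_nodes, new_edges) in the same toy encoding.
    """
    new_nodes, new_edges = {}, []
    for name, props in students.items():
        group = props[attr]
        s, e = props["interval"]
        # The new node must exist whenever an incident edge does, so grow its span.
        gs, ge = new_nodes.get(group, (s, e))
        new_nodes[group] = (min(gs, s), max(ge, e))
        # The new edge lives exactly as long as the student node does.
        new_edges.append((name, group, s, e))
    return new_nodes, new_edges

students = {
    "Alice": {"school": "Drexel", "interval": (1, 8)},
    "Bill": {"school": "CMU", "interval": (2, 6)},
    "Cathy": {"school": "Drexel", "interval": (4, 10)},
}
uni_nodes, uni_edges = add_attribute_nodes(students)
```

Using the covering span is a simplification: if students of one university had disjoint intervals, a faithful version would keep a set of intervals per university node instead.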
Temporal zoom
coarsen taxi trip start-times into 10-min intervals
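The coarsening step can be sketched with the standard library alone. The trip records below are invented, the zone names are placeholders, and aligning windows to midnight is an assumption; the talk does not say how Portal aligns its windows.

```python
from collections import Counter
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=10)

def window_start(ts, window=WINDOW):
    """Floor a timestamp to the start of its fixed-width window, aligned to midnight."""
    midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight + ((ts - midnight) // window) * window

# Invented trips: (start time, pickup zone, drop-off zone).
trips = [
    (datetime(2017, 10, 25, 8, 3, 12), "zoneA", "zoneB"),
    (datetime(2017, 10, 25, 8, 9, 59), "zoneA", "zoneB"),
    (datetime(2017, 10, 25, 8, 10, 0), "zoneA", "zoneC"),
]

# After coarsening, trips in the same window between the same zones collapse
# into a single edge; here the count is kept as an edge property.
coarse = Counter((window_start(t), src, dst) for t, src, dst in trips)
```

The first two trips fall into the window starting at 08:00 and merge, while the third starts a new window at 08:10.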
System architecture
[Figure: architecture diagram with the following components]
• Portal: Interactive Shell, Query Parser, System Catalog, Portal Runtime (optimizer, operators, etc)
• SparkSQL, Spark Runtime, GraphX Data Structures
• Workers (several): each with Spark Runtime and HDFS
Spark 2.0, interoperable with SparkSQL and
with BigDatalog
Physical data representation
• On-disk: Apache Parquet
- vertex / edge files
- broken down into snapshot groups
- each file sorted on start time followed by node / edge id
• In-memory:
- nested relational (Vertex-Edge RDDs)
- GraphX-based: RepresentativeGraphs (RG), OneGraph (OG), HybridGraph (HG)
[Figure: graph on nodes 1, 2, 3 whose elements are annotated with period bitsets: BitSet(p1,p2,p3,p4), BitSet(p2,p3,p4,p5), BitSet(p5), BitSet(p1,p2,p3,p4,p5), BitSet(p2,p3)]
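The idea of tagging each graph element with the set of periods in which it exists can be sketched with plain integers as bitmasks. Which bitset belongs to which node or edge in the figure is not recoverable from the slide text, so the assignment below is hypothetical, as are the helper names; Portal itself runs on Spark/GraphX, and this is only an illustration of the encoding.

```python
def bitset(*periods):
    """Encode 1-based period indices (p1 -> bit 0) as an int bitmask."""
    mask = 0
    for p in periods:
        mask |= 1 << (p - 1)
    return mask

def periods_of(mask):
    """Decode a bitmask back into a sorted list of 1-based period indices."""
    out, p = [], 1
    while mask:
        if mask & 1:
            out.append(p)
        mask >>= 1
        p += 1
    return out

def exists_in(mask, period):
    """True if the element is present in the given period."""
    return bool((mask >> (period - 1)) & 1)

# Hypothetical assignment of the bitsets to two nodes and the edge between them.
node1 = bitset(1, 2, 3, 4)
node2 = bitset(2, 3, 4, 5)
edge12 = bitset(2, 3)

# Periods in which both endpoints exist, and a check that the edge stays inside them.
both_endpoints = node1 & node2
edge_is_valid = (edge12 & both_endpoints) == edge12
```

With this encoding, extracting the snapshot for one period is a single bit test per element, and the no-dangling-edges condition becomes a bitwise AND.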
Performance highlights
• 16-node OpenStack cluster
• Apache Spark 2.0
• 4 cores, 16 GB RAM per node
PageRank on wiki-talk
PageRank on nGrams
PageRank on Twitter
Aggregate messages on wiki-talk
Vertex-subgraph on wiki-talk
Portal vs. G*
average node degree, wiki-talk
Take-aways
• TGraph: a logical model of property graphs with time
• TGA: a compositional temporal graph algebra under point semantics
• Portal: a library on top of Apache Spark, interoperable with SparkSQL
• Ongoing work on a declarative language, multi-operator query optimization, benchmarking
• Planned open source release this Fall
References
• Temporal Graph Algebra, Moffitt & Stoyanovich, DBPL 2017.
• Zooming in on NYC taxi data with Portal, Stoyanovich, Gilbride and Moffitt, DSSG 2017 (arXiv).
• Towards sequenced semantics for evolving graphs, Moffitt & Stoyanovich, EDBT 2017.
• Towards a distributed infrastructure for evolving graph analytics, Moffitt & Stoyanovich, TempWeb 2016.
• Vera Moffitt’s Ph.D. thesis.
Thank you!
