Graph databases and the Panama Papers - Stefan Armbruster - Codemotion Milan 2016
190+ journalists in more than 65 countries
12 staff members (USA, Costa Rica, Venezuela, Germany, France, Spain)
50% of the team = Data & Research Unit
raw files
metadata (author; sender...)
database
search and discovery
raw text
3 million files x 10 seconds per file = 1 yr
1 yr / 35 servers = 1.5 weeks
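The figures on this slide can be checked with a few lines of arithmetic. The sketch below is only a back-of-the-envelope calculation: it assumes the work splits perfectly evenly across servers, and the function name `processing_days` is made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def processing_days(files: int, seconds_per_file: float, servers: int = 1) -> float:
    """Wall-clock days needed to process all files.

    Assumes every file takes the same time and that the load is spread
    perfectly evenly across the servers (an idealisation).
    """
    return files * seconds_per_file / servers / SECONDS_PER_DAY


single = processing_days(3_000_000, 10)               # one machine
cluster = processing_days(3_000_000, 10, servers=35)  # 35 machines in parallel

print(f"1 server:   {single:.0f} days")
print(f"35 servers: {cluster:.1f} days")
```

One machine comes out at a little under a year, and 35 machines at roughly ten days, which matches the rounded "1 yr" and "1.5 weeks" on the slide.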
Lucene syntax queries with proximity matching!
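In Lucene query syntax a proximity search is written as a quoted phrase followed by a tilde and a distance, for example "offshore company"~2. The sketch below illustrates the idea in plain Python. It is a simplification, not Lucene's actual slop calculation, and the helper name `within_proximity` and the sample sentence are invented for illustration.

```python
import re


def within_proximity(text: str, term_a: str, term_b: str, max_distance: int) -> bool:
    """Return True if term_a and term_b occur at most max_distance word positions apart.

    Simplified stand-in for a proximity query: distance is the absolute
    difference between word positions, ignoring word order.
    """
    words = re.findall(r"\w+", text.lower())
    positions_a = [i for i, w in enumerate(words) if w == term_a.lower()]
    positions_b = [i for i, w in enumerate(words) if w == term_b.lower()]
    return any(abs(a - b) <= max_distance for a in positions_a for b in positions_b)


# Hypothetical sample text, not taken from the leaked documents.
sample = "The shares of the offshore holding company were transferred to a trust."

near = within_proximity(sample, "offshore", "company", 2)  # two positions apart
far = within_proximity(sample, "shares", "trust", 3)       # ten positions apart
print(near, far)
```

The point for an investigation is that a name and a keyword rarely sit directly next to each other in a scanned document, so matching "near" rather than "adjacent" finds more relevant files.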
400 users
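// Create one Person node per CSV row; MERGE does not duplicate existing nodes.
// Note: toInt() is named toInteger() in newer Neo4j releases.
LOAD CSV WITH HEADERS FROM "url" AS row
MERGE (:Person {name: row.name,
                age: toInt(row.age)});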
// Load a JSON document with the APOC procedure library and create one Contract per item.
CALL apoc.load.json("url") YIELD value AS doc
UNWIND doc.items AS item
MERGE (:Contract {title: item.title,
                  amount: toFloat(item.amount)});
# Offline bulk import of node and relationship CSV files into a new store.
bin/neo4j-import --into people.db \
    --nodes:Person people.csv \
    --nodes:Company companies.csv \
    --relationships:STAKEHOLDER stakeholders.csv
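The import tool reads its structure from the CSV header row: node files mark their key column with `:ID`, and relationship files use `:START_ID` and `:END_ID` to point at those keys. The sketch below writes three small files in that shape. All rows, column names such as `personId`, and the ID-space names `Person` and `Company` are hypothetical placeholders; the real files behind the slide are not shown in the deck.

```python
import csv
import tempfile
from pathlib import Path


def write_csv(path, header, rows):
    """Write one CSV file with the given header row and data rows."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)


def write_import_files(target_dir):
    """Create people.csv, companies.csv and stakeholders.csv with import-style headers."""
    target = Path(target_dir)
    # Node files: one column tagged :ID, here with an ID space per node type.
    write_csv(target / "people.csv",
              ["personId:ID(Person)", "name"],
              [["p1", "Alice Example"], ["p2", "Bob Example"]])
    write_csv(target / "companies.csv",
              ["companyId:ID(Company)", "name"],
              [["c1", "Example Holdings Ltd."]])
    # Relationship file: start and end keys refer to the ID columns above.
    write_csv(target / "stakeholders.csv",
              [":START_ID(Person)", ":END_ID(Company)"],
              [["p1", "c1"], ["p2", "c1"]])
    return target


out_dir = write_import_files(tempfile.mkdtemp())
headers = {name: (out_dir / name).read_text(encoding="utf-8").splitlines()[0]
           for name in ("people.csv", "companies.csv", "stakeholders.csv")}
for name, header in headers.items():
    print(name, "->", header)
```

Because the label and relationship type are given on the command line (`--nodes:Person`, `--relationships:STAKEHOLDER`), the files themselves do not need `:LABEL` or `:TYPE` columns in this sketch.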
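// For John Doe and every manager up to 3 levels below him,
// count the reports found 1 to 3 levels further down.
MATCH (boss)-[:MANAGES*0..3]->(sub),
      (sub)-[:MANAGES*1..3]->(report)
WHERE boss.name = "John Doe"
RETURN sub.name AS Subordinate,
       count(report) AS Total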
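To see what the two variable-length patterns do, the query can be mimicked on a small in-memory org chart. The sketch below is an illustration only, not how Neo4j evaluates the query. It assumes the MANAGES relationships form a tree, so each node is reached by exactly one downward path, and every name other than John Doe is made up.

```python
from collections import Counter

# Hypothetical org chart: manager -> direct reports (a tree).
MANAGES = {
    "John Doe": ["Ann", "Bob"],
    "Ann": ["Cid", "Dee"],
    "Cid": ["Eve"],
}


def reachable(start, min_hops, max_hops):
    """Yield the end node of every downward path of min_hops..max_hops MANAGES steps."""
    frontier = [start]
    for hops in range(max_hops + 1):
        if hops >= min_hops:
            yield from frontier
        # Move one level down the hierarchy.
        frontier = [report for manager in frontier
                    for report in MANAGES.get(manager, [])]


def report_counts(boss):
    """Mimic the query: for each sub 0..3 levels below boss, count reports 1..3 levels below sub."""
    totals = Counter()
    for sub in reachable(boss, 0, 3):
        for _report in reachable(sub, 1, 3):
            totals[sub] += 1
    return dict(totals)


counts = report_counts("John Doe")
print(counts)
```

As in the Cypher version, people without any reports (Bob, Dee and Eve here) do not appear in the result, because the second pattern requires at least one MANAGES hop.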
