© 2022 Neo4j, Inc. All rights reserved.
How to Import JSON Using Cypher and APOC
Eric Monk, Principal Solutions Engineer
JSON Data Overview
Data we will use
• Start with :play movies
• Run Cypher that creates the sample data
◦ Movie, Person
◦ ACTED_IN, DIRECTED, …
• Using Cypher + APOC the following files have been exported
◦ play-movies-db.json
◦ play-movies-roles.json
• Files exist in the <db_root>/import directory
DB Export Cypher
WITH "
MATCH (p:Person)-[r:ACTED_IN]->(m:Movie)
WHERE r.roles IS NOT NULL
WITH apoc.coll.sort(apoc.coll.toSet(apoc.coll.flatten(collect(r.roles)))) as allRoles
UNWIND range(1, size(allRoles), 1) as index
WITH apoc.map.fromPairs(collect([allRoles[index], index])) as roleIndexMap
MATCH (m:Movie)
WITH m,
[(m)<-[r:ACTED_IN]-(person:Person) | {
roleIndexes: [x IN r.roles | roleIndexMap[x]],
person: person { .name, .born }
}] as actors,
[(m)<-[:DIRECTED]-(person:Person) | person { .name, .born }] as directors,
[(m)<-[:PRODUCED]-(person:Person) | person { .name, .born }] as producers,
[(m)<-[:WROTE]-(person:Person) | person { .name, .born }] as writers,
[(m)<-[r:REVIEWED]-(person:Person) | {
review: r { .rating, .summary },
person: person { .name, .born }
}] as reviewers
Converting roles to indexes makes import more difficult
Relationships with properties have deeper nesting
DB Export Cypher (continued)
WITH apoc.map.merge({
title: m.title,
tagline: m.tagline,
released: m.released,
actors: actors,
producers: producers,
directors: directors,
writers: writers
}, CASE WHEN size(reviewers) = 0 THEN {} ELSE { reviewers: reviewers } END) as movieJson
RETURN movieJson
" as query
CALL apoc.export.json.query(query, "play-movies-db.json")
YIELD file, source, format, nodes, relationships, properties, time, rows, batchSize, batches, done, data
RETURN file, source, format, nodes, relationships, properties, time, rows, batchSize, batches, done, data
Note: apoc.export.file.enabled=true needs to be set in neo4j.conf
reviewers is optionally present
DB JSON
{"movieJson":{"actors":[{"person":{"born":1978,"name":"Emil Eifrem"},"roleIndexes":[55]},...
{"movieJson":{"actors":[{"person":{"born":1960,"name":"Hugo Weaving"},"roleIndexes":[4]},...
{"movieJson":{"actors":[{"person":{"born":1960,"name":"Hugo Weaving"},"roleIndexes":[4]},...
{"movieJson":{"actors":[{"person":{"born":1940,"name":"Al Pacino"},"roleIndexes":[99]},...
{"movieJson":{"actors":[{"person":{"born":1967,"name":"James Marshall"},"roleIndexes":[141]},...
{"movieJson":{"actors":[{"person":{"born":1959,"name":"Val Kilmer"},"roleIndexes":[80]},...
{"movieJson":{"actors":[{"person":{"born":1974,"name":"Jerry O'Connell"},"roleIndexes":[62]},...
We have a file with a JSON string on each line
DB JSON Example 1
{
"movieJson": {
"actors": [
{
"person": { "born": 1960, "name": "Hugo Weaving" },
"roleIndexes": [4]
},
...
],
"directors": [
{ "born": 1965, "name": "Lana Wachowski" },
{ "born": 1967, "name": "Lilly Wachowski" }
],
"tagline": "Welcome to the Real World",
"writers": [],
"title": "The Matrix",
"released": 1999,
"producers": [{ "born": 1952, "name": "Joel Silver" }]
}
}
roleIndexes: we'll need to look these up before writing
reviewers is missing
DB JSON Example 2
{
"movieJson": {
"actors": [
{
"person": { "born": 1974, "name": "Jerry O'Connell" },
"roleIndexes": [62]
},
...
],
"directors": [{ "born": 1957, "name": "Cameron Crowe" }],
"tagline": "The rest of his life begins now.",
"writers": [{ "born": 1957, "name": "Cameron Crowe" }],
"title": "Jerry Maguire",
"reviewers": [
{
"person": { "born": null, "name": "Jessica Thompson" },
"review": { "summary": "You had me at Jerry", "rating": 92 }
}
],
"released": 2000,
"producers": [{ "born": 1957, "name": "Cameron Crowe" }]
}
}
reviewers is present
Roles Export Cypher
WITH "
MATCH (p:Person)-[r:ACTED_IN]->(m:Movie)
WHERE r.roles IS NOT NULL
WITH apoc.coll.sort(apoc.coll.toSet(apoc.coll.flatten(collect(r.roles)))) as allRoles
UNWIND range(1, size(allRoles), 1) as index
WITH collect({roleId: index, role: allRoles[index]}) as allRoles
RETURN { roles: allRoles } as rolesJson
" as query
CALL apoc.export.json.query(query, "play-movies-roles.json")
YIELD file, source, format, nodes, relationships, properties, time, rows, batchSize, batches, done, data
RETURN file, source, format, nodes, relationships, properties, time, rows, batchSize, batches, done, data
Use a roleId to identify a role
Play Movies Roles JSON
{
"rolesJson": {
"roles": [
{ "role": "\"Wild Bill\" Wharton", "roleId": 1 },
{ "role": "Ace Merrill", "roleId": 2 },
{ "role": "Admiral", "roleId": 3 },
{ "role": "Agent Smith", "roleId": 4 },
{ "role": "Albert Goldman", "roleId": 5 },
...more roles…
{ "role": "Walter", "roleId": 181 },
{ "role": "Warden Hal Moores", "roleId": 182 },
{ "role": "Zachry", "roleId": 183 },
{ "role": null, "roleId": 184 }
]
}
}
We have a null we'll have to deal with
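The trailing null is most likely a side effect of the export query itself. Cypher lists are zero-based, so iterating range(1, size(allRoles)) and reading allRoles[index] skips the first element and reads one position past the end, which evaluates to null. A minimal read-only sketch, using a made-up three-element list rather than the real role data:

```cypher
// Hypothetical stand-in for the sorted role list (not the real data)
WITH ['Ace Merrill', 'Admiral', 'Agent Smith'] AS allRoles
UNWIND range(1, size(allRoles)) AS index
// index 3 is out of bounds for a zero-based list, so role is null on the last row
RETURN index AS roleId, allRoles[index] AS role
```

Reading allRoles[index - 1] instead would keep the 1-based roleId values and avoid the null entry, but the import slides that follow assume the file as exported.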
JSON Exploration
Exploring the JSON
• Use apoc.load.jsonArray(arg1, arg2)
• Arg1 - path to JSON file
• Arg2 - (optional) JSON path to fetch nested data
Example
CALL apoc.load.jsonArray('file:///play-movies-db.json', '$.movieJson')
Note: apoc.import.file.enabled=true needs to be set in neo4j.conf
file:/// points to the /import directory
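A sketch of how the example call might be extended in Neo4j Browser to look at a few fields. The YIELD value column is the one used in the import slides later on, the field names come from the JSON examples shown earlier, and the LIMIT is an arbitrary choice.

```cypher
// Each yielded value is the map found under $.movieJson on one line of the file
CALL apoc.load.jsonArray('file:///play-movies-db.json', '$.movieJson') YIELD value
RETURN value.title AS title,
       value.released AS released,
       size(value.actors) AS actorCount
LIMIT 5
```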
Run Examples in Neo4j Browser
• E1: Examine Movies JSON
• E2: Examine Roles JSON
• E3: Select Movies JSON w/Path
• E4: Select Roles JSON w/Path
• E5: Look at Movie Properties
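The queries behind E1 to E5 are not reproduced on the slide. As one guess at what E5 might look like, the sketch below counts how often each top-level key appears across the movie rows, which should make optional keys such as reviewers stand out. It is an illustration only, not the presenter's actual example.

```cypher
// Count how many movie rows carry each top-level key
CALL apoc.load.jsonArray('file:///play-movies-db.json', '$.movieJson') YIELD value
UNWIND keys(value) AS key
RETURN key, count(*) AS moviesWithKey
ORDER BY key
```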
JSON Import
Loading from JSON
We want to…
• CREATE/MERGE nodes
• CREATE/MERGE relationships
• Set node and relationship properties
• Handle optional data
• Look up data from another file
L1: Load Movies
CALL apoc.load.jsonArray('file:///play-movies-db.json',
'$.movieJson') YIELD value
WITH value as movieData
MERGE (m:Movie {title: movieData.title})
SET m += {
tagline: movieData.tagline,
released: movieData.released // note: you could apply something like toInteger() here, but in this case it's already a number
}
MERGE Movie on title
Set properties
Convert / data quality check if needed
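As a sketch of what a conversion and data quality check could look like, the read-only query below runs against three invented rows instead of the real file. It converts released with toInteger() and drops rows that lack a title or a usable year; the sample rows and the filtering rule are assumptions for illustration.

```cypher
// Hypothetical rows: one clean, one with an unparseable year, one without a title
UNWIND [
  {title: 'The Matrix', released: 1999},
  {title: 'Bad Row', released: 'n/a'},
  {released: 2000}
] AS movieData
// toInteger() returns null when a string cannot be parsed as a number
WITH movieData, toInteger(movieData.released) AS released
WHERE movieData.title IS NOT NULL AND released IS NOT NULL
RETURN movieData.title AS title, released
```

In the real import, the same WITH ... WHERE filter would sit between the apoc.load.jsonArray call and the MERGE.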
L2: Load Directors
// L2: Load Directors
// ...previous Cypher...
FOREACH (directorInfo IN movieData.directors |
MERGE (p:Person {name: directorInfo.name})
SET p.born = directorInfo.born
MERGE (p)-[:DIRECTED]->(m)
)
Use FOREACH to loop through the directors array
MERGE Person on name
Set born property
Add DIRECTED relationship
L3: Add Writers and Producers
// L3: Add writers and producers
// ...previous Cypher...
// writers
FOREACH (personInfo IN movieData.writers |
MERGE (p:Person {name: personInfo.name})
SET p.born = personInfo.born
MERGE (p)-[:WROTE]->(m)
)
// producers
FOREACH (personInfo IN movieData.producers |
MERGE (p:Person {name: personInfo.name})
SET p.born = personInfo.born
MERGE (p)-[:PRODUCED]->(m)
)
Using same pattern as before: loop through arrays, MERGE nodes, set properties, create relationships
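One reason FOREACH suits this pattern: some movies carry an empty list (The Matrix has no writers in Example 1). UNWIND on an empty list produces no rows, so anything chained after it would be skipped for that movie, whereas FOREACH does nothing and leaves the row in place. A small read-only sketch with invented data:

```cypher
// Hypothetical rows: movie B has an empty writers list
UNWIND [{title: 'A', writers: ['w1']}, {title: 'B', writers: []}] AS movieData
UNWIND movieData.writers AS writer
// Only movie A comes back; the row for B disappears at the second UNWIND
RETURN movieData.title AS title, writer
```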
L4.1: Add Reviewers - Option 1
// L4.1: Add Reviewers - Option 1
// ...previous Cypher...
// reviewers - option 1
FOREACH (reviewer IN movieData.reviewers |
MERGE (p:Person {name: reviewer.person.name})
SET p.born = reviewer.person.born
MERGE (p)-[r:REVIEWED]->(m)
SET r += reviewer.review
)
This works because no additional processing is required, but that's not always the case…
Using same pattern as before
Note the extra level of nesting: reviewer.person.name, reviewer.person.born
L4.2: Add Reviewers - Option 2
// L4.2: Add Reviewers - Option 2
// ...previous Cypher...
WITH m, movieData
CALL apoc.do.when(movieData.reviewers IS NOT NULL,
"
UNWIND reviewers as reviewer
MERGE (p:Person {name: reviewer.person.name})
SET p.born = reviewer.person.born
MERGE (p)-[r:REVIEWED]->(m)
SET r += reviewer.review
WITH collect(reviewer) as _
RETURN 1 as result
",
"
RETURN 0 as result
", { m: m, reviewers: movieData.reviewers }) YIELD value
Use apoc.do.when for conditional processing
If NOT NULL, perform logic to MERGE, SET
If NULL, do nothing
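To see the branching on its own, here is a minimal sketch that calls apoc.do.when on two invented rows and only returns a value from each branch, without writing to the graph. The rows and the inner queries are assumptions for illustration. As on the slide, the entries of the params map are referenced by name inside the inner queries.

```cypher
// Hypothetical rows: the second map has no reviewers key
UNWIND [{title: 'A', reviewers: ['r1', 'r2']}, {title: 'B'}] AS movieData
CALL apoc.do.when(
  movieData.reviewers IS NOT NULL,
  'RETURN size(reviewers) AS result',   // runs when the condition is true
  'RETURN 0 AS result',                 // runs otherwise
  {reviewers: movieData.reviewers}
) YIELD value
// value is a map holding whatever the chosen branch returned
RETURN movieData.title AS title, value.result AS result
```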
L5.1: Return Roles as a Map
// L5.1 - Return Roles as a Map
WITH {
jsonFile: 'file:///play-movies-roles.json',
jsonPath: '$.rolesJson.roles'
} as params
CALL apoc.load.jsonArray(params.jsonFile,
params.jsonPath) YIELD value
RETURN apoc.map.fromPairs(collect([value.roleId,
value.role])) as roleMap
Result
{
"1": "\"Wild Bill\" Wharton",
"2": "Ace Merrill",
"3": "Admiral",
...
}
Use apoc.map.fromPairs to create a map
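The same map-building step can be tried without the file. The sketch below feeds two invented rows, shaped like the entries in play-movies-roles.json, through apoc.map.fromPairs and then looks values up. As the Result above shows, the keys come back as strings, which is why the later lookup wraps the integer id in toString().

```cypher
// Hypothetical rows shaped like the entries in play-movies-roles.json
UNWIND [{roleId: 1, role: 'Ace Merrill'}, {roleId: 2, role: 'Admiral'}] AS value
WITH apoc.map.fromPairs(collect([value.roleId, value.role])) AS roleMap
// Look up once with a literal string key and once with a converted integer
RETURN roleMap['1'] AS byStringKey, roleMap[toString(2)] AS byConvertedKey
```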
L5.2: Load Roles First
// L5.2 - Load Actors and Roles
WITH {
jsonFile: 'file:///play-movies-db.json',
jsonPath: '$.movieJson',
roleJsonFile: 'file:///play-movies-roles.json',
roleJsonPath: '$.rolesJson.roles'
} as params
CALL apoc.load.jsonArray(params.roleJsonFile,
params.roleJsonPath) YIELD value
WITH params, apoc.map.fromPairs(collect([value.roleId,
value.role])) as roleMap
CALL apoc.load.jsonArray(params.jsonFile,
params.jsonPath) YIELD value
WITH params, roleMap, value as movieData
Load roles JSON
Create roleMap
Load movies JSON
Pass roleMap in WITH
L5.2: Load Actors
// ...previous Cypher…
// actors
FOREACH (actor IN movieData.actors |
MERGE (p:Person {name: actor.person.name})
SET p.born = actor.person.born
MERGE (p)-[r:ACTED_IN]->(m)
SET r += {
roles: [x IN
[roleIndex IN actor.roleIndexes |
roleMap[toString(roleIndex)]]
WHERE x IS NOT NULL | x
]
}
)
For each roleIndex, use roleMap to get the role string
Filter out NULL by wrapping it in another list comprehension
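The nested comprehension can be checked in isolation. This read-only sketch uses an invented roleMap in which one id maps to null and another id is missing, so only the resolvable role should remain in the result. The ids and the role name are made up for illustration.

```cypher
// Hypothetical map: id 184 maps to null, id 999 is not present at all
WITH apoc.map.fromPairs([[4, 'Agent Smith'], [184, null]]) AS roleMap,
     [4, 184, 999] AS roleIndexes
// Inner comprehension does the lookup, outer comprehension drops the nulls
RETURN [x IN [roleIndex IN roleIndexes | roleMap[toString(roleIndex)]]
        WHERE x IS NOT NULL | x] AS roles
```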
Summary
• Use apoc.load.jsonArray() to load JSON
• MERGE to create nodes and relationships, SET to set properties
• FOREACH to process JSON lists
• Use apoc.do.when for conditional processing
• Convert "lookup" JSON file to map, and make it available via WITH
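The slides build the import in pieces joined by "...previous Cypher...". Below is one way those pieces might be assembled into a single statement. It is a sketch stitched together from slides L1 to L5.2, not a query shown in the deck. The coalesce() around reviewers and the final RETURN are additions made here, and the statement assumes both files are in the import directory.

```cypher
WITH {
  jsonFile: 'file:///play-movies-db.json',
  jsonPath: '$.movieJson',
  roleJsonFile: 'file:///play-movies-roles.json',
  roleJsonPath: '$.rolesJson.roles'
} AS params

// Build the roleId -> role lookup map first
CALL apoc.load.jsonArray(params.roleJsonFile, params.roleJsonPath) YIELD value
WITH params, apoc.map.fromPairs(collect([value.roleId, value.role])) AS roleMap

// Then load the movies and carry roleMap along
CALL apoc.load.jsonArray(params.jsonFile, params.jsonPath) YIELD value
WITH roleMap, value AS movieData
MERGE (m:Movie {title: movieData.title})
SET m += {tagline: movieData.tagline, released: movieData.released}

FOREACH (personInfo IN movieData.directors |
  MERGE (p:Person {name: personInfo.name})
  SET p.born = personInfo.born
  MERGE (p)-[:DIRECTED]->(m)
)
FOREACH (personInfo IN movieData.writers |
  MERGE (p:Person {name: personInfo.name})
  SET p.born = personInfo.born
  MERGE (p)-[:WROTE]->(m)
)
FOREACH (personInfo IN movieData.producers |
  MERGE (p:Person {name: personInfo.name})
  SET p.born = personInfo.born
  MERGE (p)-[:PRODUCED]->(m)
)
// coalesce() is an addition: it makes the missing-reviewers case explicit
FOREACH (reviewer IN coalesce(movieData.reviewers, []) |
  MERGE (p:Person {name: reviewer.person.name})
  SET p.born = reviewer.person.born
  MERGE (p)-[r:REVIEWED]->(m)
  SET r += reviewer.review
)
FOREACH (actor IN movieData.actors |
  MERGE (p:Person {name: actor.person.name})
  SET p.born = actor.person.born
  MERGE (p)-[r:ACTED_IN]->(m)
  SET r.roles = [x IN [roleIndex IN actor.roleIndexes | roleMap[toString(roleIndex)]]
                 WHERE x IS NOT NULL | x]
)
// Added for feedback only: one row per movie line is counted
RETURN count(m) AS moviesProcessed
```

Because every write uses MERGE, running the statement a second time should not create duplicate nodes or relationships.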
Thank you!
Contact us at sales@neo4j.com
