Using Neo4j for exploring the research graph connections made by RD-Switchboard

Dr. Amir Aryani (ANDS) and Dr. Jingbo Wang (NCI)
October 2016
Agenda
• Background: RD-Switchboard & Research Graph
• Neo4j: Queries
• NCI: Graph connections made by RD-Switchboard
using NCI’s metadata
• Q & A
Background
Challenge of Cross-Platform Discovery of Research Data
{All started here!}
Research Data Australia
Suggested Links
March 2014, Version 12
Data Description Registry
Interoperability (DDRI) Working Group
Research Data Alliance
Goal: enabling cross-platform discovery between
research data infrastructures
Precipitous Growth
• RDA Launch / First Plenary, March 2013, Gothenburg, Sweden: 240 participants; first Working Groups and Interest Groups
• RDA Second Plenary, September 2013, Washington, DC, USA: 380 participants from 22 countries; first “neutral space” community meeting (Data Citation Summit); first Organizational Partner Meet-up; first BOFs
• RDA Third Plenary, March 2014, Dublin, Ireland: 497 participants from 32 countries; first Organizational Assembly; 6 co-located events; 14 BOFs, 12 Working Groups, 22 Interest Groups
• RDA Fourth Plenary, September 2014, Amsterdam, Netherlands: 550 participants from 40 countries; 1st RDA Deliverables presented; Organizational Assembly and first OAB / Council meeting; 10 co-located events; 11 BOFs, 14 Working Groups, 36 Interest Groups
• RDA Fifth Plenary, March 2015, San Diego, CA, USA: 383 participants from 30 countries; 2nd RDA Deliverables presented; Organizational Assembly / Council meetings; 1st Adoption Day & large-scale data projects meeting; 10 BOFs, 10 Working Groups, 20 Interest Groups; 10 joint sessions; 4 thematic plenary sessions
Research Data Alliance
June 2016: close to 4,200 members from 110 countries
DDRI WG Approach
Connecting datasets on the basis of
co-authorship or other collaboration
models such as joint funding and
grants.
Research Data Alliance
https://researchdata.ands.org.au/idmm-immunome-database-for-marsupials-and-monotremes/11139
Show 105 more
publications
http://dx.doi.org/10.1371/journal.pone.0079092
One of the 105 articles …
doi:10.5061/dryad.4qq0v
Authors: Wong ESW, Nichol
S, Warren WC, Belov K
Dryad Dataset
http://datadryad.org/resource/doi:10.5061/dryad.4qq0v
We have found another dataset from
the same author…
Dataset
Researcher
Publication
Dataset
Using machines…
Connecting Datasets by
Three Degrees of Separation
http://researchgraph.org/schema/
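The Dataset, Researcher, Publication, Dataset chain can be illustrated with a small, self-contained sketch. The node names and edges below are hypothetical stand-ins for the example on the slides, not data exported from RD-Switchboard, and `within_degrees` is an illustrative helper rather than part of any library. It runs a breadth-first search that stops after a fixed number of hops.

```python
from collections import deque

# Hypothetical toy graph (undirected adjacency lists), loosely following the
# slides: an ANDS dataset, a researcher, a publication and a Dryad dataset.
GRAPH = {
    "dataset:ands-idmm": ["researcher:r1"],
    "researcher:r1": [
        "dataset:ands-idmm",
        "publication:10.1371/journal.pone.0079092",
    ],
    "publication:10.1371/journal.pone.0079092": [
        "researcher:r1",
        "dataset:10.5061/dryad.4qq0v",
    ],
    "dataset:10.5061/dryad.4qq0v": ["publication:10.1371/journal.pone.0079092"],
}


def within_degrees(graph, start, max_hops):
    """Return {node: hops} for every node reachable from start in at most max_hops."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph.get(node, []):
            if neighbour not in hops:
                hops[neighbour] = hops[node] + 1
                queue.append(neighbour)
    del hops[start]  # the start node is not a "connection"
    return hops


reachable = within_degrees(GRAPH, "dataset:ands-idmm", 3)
linked_datasets = sorted(n for n in reachable if n.startswith("dataset:"))
print(linked_datasets)
```

With a limit of two hops the Dryad dataset in this toy graph is no longer reached, which is why the hop limit matters.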
More info
http://researchgraph.org/schema
https://github.com/researchgraph/schema
https://github.com/rd-switchboard/Inference
Neo4j
Neo4j Graph Browser
Neo4j Queries
1. Find a Dataset
2. Find a Publication
3. Find a Grant
4. Find a Researcher
5. Find links to ORCID
6. Find datasets that have DOI
7. Find DOIs using prefix
8. Find highly connected datasets
9. Connections with multiple degrees of separation
10. Find shortest path between two researchers
Find a Dataset
match (n:dataset) where n.doi='10.5524/100166' return n
match (n:dataset) where n.title='The genome of the Australian dragon lizard Pogona vitticeps' return n
Find a Publication
match (n:publication) where n.doi='10.5170/CERN-2014-008.181' return n
match (n:cern:publication) where n.title='LHC Results - Highlights' return n
Find a Grant
match (n:grant) where n.purl='purl.org/au-research/grants/arc/LP0991658' return n
match (n:grant) where n.title='Hyper-accumulations of monosulfidic sediments' return n
Find a Researcher
match (n:researcher) where n.scopus_id='37071260700' return n
match (n:researcher) where n.orcid='0000-0002-7875-2902' return n
match (n:researcher) where n.last_name='Rajiah' and n.first_name='Kingston' return n
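When lookups like these are issued from a script, the values are usually passed as query parameters instead of being pasted into the query text. The sketch below only builds the query string and parameter dictionary and does not talk to a database. `researcher_query` and `ALLOWED_PROPERTIES` are hypothetical names, and the `$name` placeholder form assumes a Neo4j version that accepts it (older releases used a `{name}` form).

```python
# Property names cannot be passed as Cypher parameters, so they are checked
# against an allow-list; the values travel separately as parameters.
ALLOWED_PROPERTIES = {"scopus_id", "orcid", "first_name", "last_name"}


def researcher_query(**criteria):
    """Build a (query, parameters) pair for a researcher lookup."""
    if not criteria:
        raise ValueError("at least one criterion is required")
    unknown = set(criteria) - ALLOWED_PROPERTIES
    if unknown:
        raise ValueError(f"unsupported properties: {sorted(unknown)}")
    names = sorted(criteria)  # stable order for a predictable query string
    conditions = " and ".join(f"n.{name} = ${name}" for name in names)
    query = f"match (n:researcher) where {conditions} return n"
    return query, dict(criteria)


query, params = researcher_query(last_name="Rajiah", first_name="Kingston")
print(query)
print(params)
```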
Find links to ORCID
match (n:dataset:dryad)--(o:orcid) return count(n)
match (n:dataset:ands)--(o:orcid) where n.ands_group='The University of Sydney' return n limit 10
Find Datasets With DOI
match (n:dataset) where exists (n.doi) return count(n)
Find DOIs using Prefix
match (n:dataset) where n.doi=~'10.4225/.*' return n limit 10
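A detail worth noting about the prefix query: Cypher's `=~` operator matches the regular expression against the whole string, and an unescaped `.` matches any character. The sketch below mirrors that behaviour with Python's `re.fullmatch`. The DOI values are made up for illustration, and Python's regex dialect is only assumed to agree with Neo4j's for a pattern this simple. A plain prefix test (Cypher also offers `STARTS WITH`) avoids the issue.

```python
import re

PATTERN = r"10.4225/.*"          # as on the slide: "." matches any character
STRICT_PATTERN = r"10\.4225/.*"  # escaped dot matches a literal "." only

# Hypothetical DOI values for illustration.
dois = ["10.4225/08/abc", "10.5061/dryad.4qq0v", "10x4225/oops"]

loose = [d for d in dois if re.fullmatch(PATTERN, d)]
strict = [d for d in dois if re.fullmatch(STRICT_PATTERN, d)]
by_prefix = [d for d in dois if d.startswith("10.4225/")]

print(loose)
print(strict)
print(by_prefix)
```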
Find Highly Connected Datasets
match (n:ands:dataset)--(x) return n.key, n.title, count(x) order by count(x) DESC limit 25
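The query above ranks datasets by how many matches `(n)--(x)` produces for each one. The same counting can be sketched on an in-memory edge list. The keys below are hypothetical and `most_connected` is an illustrative helper. As in the Cypher query, every listed connection is counted, so repeated links to the same neighbour would count more than once.

```python
from collections import Counter

# Hypothetical edges: (dataset key, connected node).
EDGES = [
    ("ands:1", "researcher:a"),
    ("ands:1", "publication:p1"),
    ("ands:1", "grant:g1"),
    ("ands:2", "researcher:a"),
    ("ands:3", "researcher:b"),
    ("ands:3", "publication:p2"),
]


def most_connected(edges, limit):
    """Return up to `limit` (dataset, connection count) pairs, highest first."""
    degree = Counter(dataset for dataset, _ in edges)
    return degree.most_common(limit)


top = most_connected(EDGES, 2)
print(top)
```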
Connections with Multiple Degrees of Separation
match (n:ands:dataset)-[*1..3]-(d:dryad:dataset) return n.title, d.key limit 25
Find Shortest Path Between Two Researchers
MATCH p=shortestPath(
  (d1:dryad:dataset {doi: '10.5061/dryad.4qq0v'})-[*]-(d2:ands:dataset {doi: '10.1186/1471-2172-12-48'})
) RETURN p
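The idea behind a shortest-path query on an unweighted graph can be sketched with a breadth-first search. This is an illustration on a hypothetical toy graph, not a description of how Neo4j implements `shortestPath`. `build_graph` and `shortest_path` are illustrative helpers.

```python
from collections import deque

# Hypothetical undirected edges with two routes between dataset:d1 and dataset:d2.
EDGES = [
    ("dataset:d1", "publication:p1"),
    ("publication:p1", "researcher:r1"),
    ("researcher:r1", "dataset:d2"),
    ("dataset:d1", "researcher:r2"),
    ("researcher:r2", "publication:p2"),
    ("publication:p2", "researcher:r3"),
    ("researcher:r3", "dataset:d2"),
]


def build_graph(edges):
    """Turn an edge list into undirected adjacency lists."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, []).append(b)
        graph.setdefault(b, []).append(a)
    return graph


def shortest_path(graph, start, goal):
    """Return one shortest path from start to goal as a list of nodes, or None."""
    if start == goal:
        return [start]
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, []):
            if neighbour in previous:
                continue  # already reached by a path that is no longer
            previous[neighbour] = node
            if neighbour == goal:
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbour)
    return None


graph = build_graph(EDGES)
path = shortest_path(graph, "dataset:d1", "dataset:d2")
print(path)
```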
NCI: Graph connections
made by RD-Switchboard
using NCI’s metadata
nci.org.au
@NCInews
Research Data Collections 10PB+
• CMIP5: 3 PB
• Astronomy (Optical): 200 TB
• Water, Ocean: 1.5 PB
• Atmosphere: 2.4 PB
• Earth Observ.: 2 PB
• Marine Videos: 10 TB
• Geophysics: 300 TB
• Weather: 340 TB
© National Computational
Infrastructure 2015
NERDIP: National Environment
Research Data Platform
Current research record status
Each individual catalogue record describes a linear relationship among entities:
• Researcher A: use Data 1, supported by Grant a, generate Paper I, II
• Researcher B: use Data 1, supported by Grant b, generate Paper II, III
• Researcher B: use Data 2, supported by Grant b, generate Paper IV
Graph database structure
The relational database is converted to, and presented as, a graph database using Research Data Switchboard (RD-Switchboard):
[Figure: a single graph in which each entity appears once (Researcher A, Researcher B, Data 1, Data 2, Grant a, Grant b, Papers I to IV), joined by "use", "supported by" and "generate" edges]
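The step from linear records to a graph can be sketched as follows, using the three records from the "Current research record status" slide. How the edges are attached (researcher to data, data to grant, grant to paper) is an assumption that simply follows the order of each linear record. The slides do not spell out the actual RD-Switchboard model, and `records_to_graph` is a hypothetical helper.

```python
# The three catalogue records, as (researcher, data, grant, papers).
RECORDS = [
    ("Researcher A", "Data 1", "Grant a", ["Paper I", "Paper II"]),
    ("Researcher B", "Data 1", "Grant b", ["Paper II", "Paper III"]),
    ("Researcher B", "Data 2", "Grant b", ["Paper IV"]),
]


def records_to_graph(records):
    """Merge linear records into sets of unique nodes and labelled edges."""
    nodes, edges = set(), set()
    for researcher, data, grant, papers in records:
        nodes.update([researcher, data, grant, *papers])
        # Assumed edge layout: each link follows the order of the linear record.
        edges.add((researcher, "use", data))
        edges.add((data, "supported by", grant))
        for paper in papers:
            edges.add((grant, "generate", paper))
    return nodes, edges


nodes, edges = records_to_graph(RECORDS)
print(len(nodes), len(edges))
```

Because nodes are stored in a set, entities mentioned in several records (Data 1, Researcher B, Grant b, Paper II) appear only once, which is what makes shared connections visible.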
User question: RD-Switchboard query:
Catalogue system infrastructure
NCI GeoNetwork architecture: http://geonetwork.nci.org.au
Harvest and synchronization
RD-Switchboard benefits so far…
• Identify missing critical metadata entries;
• Identify errors in the catalogue entries;
• Provide an analytical view of how research data has been used so far (highly utilised or underutilised?);
• Evaluate the impact of datasets, researchers and institutes;
• Encourage the use of URIs, DOIs, ORCIDs, etc.
[Figure: graph linking researcher 1, researcher 2, paper 1, paper 2 and datasets (dataset, data2, data3, data4, data5)]
Any conflict of interest?
Possible collaboration?
eResearch BOF
Tuesday 11 October 2016 / 16:35
BoF: Research Graph: Connecting Researchers,
Research Data, Publications and Grants using the
Graph Technology 

Dr. Amir Aryani
amir.aryani@ands.org.au
Twitter: @amir_at_ands
Dr. Jingbo Wang (NCI)
jingbo.wang@anu.edu.au
