Cross-domain data
discovery and integration
Simon J D Cox
CSIRO Land and Water
7 November 2018
Finding data for x-disciplinary applications
Use cross-domain data catalogs
SCIDATACON 2018-11-07 | Cox et al. | Harmonising data description2
SCIDATACON 2018-11-07 | Cox | x-domain discovery and integration
Generic
dataset
metadata
Metadata standards
Primarily
domain usage
Variable level
Study level
Generic
applications
SSN/SOSA
DCAT
19115
EML
DQV
FHIR
HL7Q
DATS
DDI
CERIF
QB
Observation metadata
SSN Observation
- resultTime
- madeBySensor
- usedProcedure
- hasFeatureOfInterest
- hasResult
- phenomenonTime
- observedProperty
Lining up natural and social science concepts
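The SSN Observation properties listed above can be sketched as a single record type that holds both a natural science and a social science observation. This is a minimal illustration only: the field names come from the slide, while the class name `Observation` and every example value (site, household, sensor, and procedure names) are hypothetical.

```python
from dataclasses import dataclass, asdict
from typing import Any


@dataclass
class Observation:
    """One record whose fields mirror the SSN Observation properties on the slide."""
    observedProperty: str
    hasFeatureOfInterest: str
    hasResult: Any
    phenomenonTime: str   # when the result applies to the feature
    resultTime: str       # when the result was produced
    madeBySensor: str
    usedProcedure: str


# Hypothetical natural science observation (all values are invented).
river = Observation(
    observedProperty="water temperature",
    hasFeatureOfInterest="river reach A",
    hasResult={"value": 18.4, "unit": "Cel"},
    phenomenonTime="2018-11-01T09:00:00Z",
    resultTime="2018-11-01T09:00:05Z",
    madeBySensor="thermistor-17",
    usedProcedure="in-situ logging",
)

# Hypothetical social science observation using the same properties.
survey = Observation(
    observedProperty="household income",
    hasFeatureOfInterest="household 0042",
    hasResult={"value": 52000, "unit": "AUD/year"},
    phenomenonTime="2017",
    resultTime="2018-03-12",
    madeBySensor="interviewer 7",
    usedProcedure="questionnaire v3",
)

# Both records expose the same property names, so they can be compared or indexed together.
shared = sorted(set(asdict(river)) & set(asdict(survey)))
print(shared)
```

Treating the interviewer as the "sensor" and the questionnaire as the "procedure" is one possible alignment, shown here as an assumption rather than a recommendation from the talk.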
Controlled vocabularies
Provenance/context
Observation properties
Dataset statistics
Metadata mashup
Is ‘metadata’ the solution?
• Discovery might be done other ways
Semantic Data Lake – Envisaged Architecture
[Diagram labels]
User query: SPARQL query
  ?item gho:Country ?country .
  ?item gho:Disease ?disease .
  ...
Decomposing
Finding relevant data sources + query translation
Metadata: property -> data source (type)
Translated queries: SQL, XPath, MongoDB, JSONPath
  SELECT country, disease, ...
  FROM Observations
Data sources: Database, XML, File, MongoDB
Execution plan
Run-time integration?
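The run-time approach in the architecture slide (decompose the query, use metadata to find the sources that hold each property, translate per source) can be sketched roughly as follows. This is an illustrative toy, not the implementation of any real system: the `PROPERTY_SOURCES` table, the function names, the `gho:Population` entry, and the rule that a column name is the lower-cased local part of the property are all assumptions made for the example.

```python
# Hypothetical metadata: property -> (data source, source type).
PROPERTY_SOURCES = {
    "gho:Country": ("Observations", "sql"),
    "gho:Disease": ("Observations", "sql"),
    "gho:Population": ("census.xml", "xml"),
}


def decompose(triple_patterns):
    """Group (subject, property, variable) patterns by the source holding the property."""
    groups = {}
    for _subject, prop, var in triple_patterns:
        source = PROPERTY_SOURCES.get(prop)
        if source is None:
            continue  # no known source has this property, so nothing is queried for it
        groups.setdefault(source, []).append((prop, var))
    return groups


def to_sql(table, patterns):
    """Translate the patterns for one SQL source into a SELECT statement."""
    columns = ", ".join(prop.split(":")[1].lower() for prop, _var in patterns)
    return f"SELECT {columns} FROM {table}"


def plan(triple_patterns):
    """Build a list of (source type, source-specific query) steps."""
    steps = []
    for (name, kind), patterns in decompose(triple_patterns).items():
        if kind == "sql":
            steps.append((kind, to_sql(name, patterns)))
        else:
            # Translation to XPath, JSONPath, etc. is left out of this sketch.
            steps.append((kind, f"{name}: " + ", ".join(p for p, _v in patterns)))
    return steps


query = [
    ("?item", "gho:Country", "?country"),
    ("?item", "gho:Disease", "?disease"),
]
print(plan(query))
```

Only the `Observations` source appears in the plan for this query; the XML source is never touched because none of the requested properties map to it.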
Thank you
Simon J D Cox
Research Scientist
CSIRO Land and Water
simon.cox@csiro.au
Dataset catalog – schema.org
W3C Dataset catalog vocabulary - DCAT
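As a rough illustration of what a generic DCAT dataset description looks like, the sketch below builds a JSON-LD record using the `dcat:` and `dct:` vocabularies. The helper name `dcat_dataset`, the `example.org` identifiers, and all values are hypothetical, and giving `dcat:mediaType` as a plain string is a simplification.

```python
import json

# Namespace prefixes for the W3C DCAT vocabulary and Dublin Core terms.
CONTEXT = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
}


def dcat_dataset(identifier, title, keywords, download_url, media_type):
    """Return a minimal JSON-LD description of one dataset with one distribution."""
    return {
        "@context": CONTEXT,
        "@id": identifier,
        "@type": "dcat:Dataset",
        "dct:title": title,
        "dcat:keyword": keywords,
        "dcat:distribution": [
            {
                "@type": "dcat:Distribution",
                "dcat:downloadURL": {"@id": download_url},
                "dcat:mediaType": media_type,  # simplified to a string here
            }
        ],
    }


# Hypothetical dataset; the identifiers and URL are placeholders.
record = dcat_dataset(
    identifier="https://example.org/dataset/river-temperature",
    title="River temperature observations",
    keywords=["water", "temperature"],
    download_url="https://example.org/data/river-temperature.csv",
    media_type="text/csv",
)
print(json.dumps(record, indent=2))
```

Because the record carries only generic properties (title, keywords, distribution), the same shape could describe a dataset from any discipline, which is what makes it usable in a cross-domain catalog.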
CKAN ♥ DCAT


Editor's Notes

  • #7: There – two dirty words in one title!
  • #13: The Squerall project has shown promising initial results but needs further development. Performance is better than with traditional integration at ingestion time (i.e. mapping everything to a single unified data model), because you only ever query the relevant data rather than the whole lot, looking for things that aren't there.