http://mint-project.info
OKG-SOFT: AN OPEN KNOWLEDGE
GRAPH WITH MACHINE READABLE
SCIENTIFIC SOFTWARE METADATA
Daniel Garijo, Maximiliano Osorio, Deborah Khider,
Varun Ratnakar and Yolanda Gil
University of Southern California,
Information Sciences Institute
@dgarijov
OKN-AKBC
May 22nd, Amherst, USA
Science is changing: Open Science
Open publications
Open data
Open access
Open source software
Impact and credit
OKG-SOFT: AN OPEN KNOWLEDGE GRAPH WITH MACHINE READABLE SCIENTIFIC SOFTWARE METADATA –ESCIENCE 2019
The importance of Scientific Software
Open publications
Open data
Open source software
• Software helps understand data
• Provenance, reproducibility
• Software helps understand methods
• Assumptions, limitations
Why is it difficult to reuse Scientific Software?
Let’s imagine we want to reuse existing work:
[Figure: data sources FLDAS (climate) and remote sensing; Software A, a hydrology model (inputs: Weather, DEM, Infiltration; outputs: Outflow, Error); Software B, map-based visualizations]
• How do I transform data?
• How do I invoke code?
• How do I interpret results?
Outline
1. Requirements that help scientific software reusability
2. Our current approach for representing scientific software metadata
3. A framework to query, explore, exploit and publish software metadata
Requirements for Software Reusability
1. Exposing software inputs, outputs and their corresponding variables
[Figure: Hydrology Software Model; inputs (Input1, Input2, Input3): Weather, DEM, Infiltration; outputs (Output1, Output2): Outflow, Error]
Variables:
- Land surface temperature (degC)
- Precipitation rate (mm/h)
- Land surface wind speed (m/day)
- Net radiation (MJ/(day m^2))
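One way to expose this information in machine-readable form is sketched below with Python dataclasses. The class and field names (Variable, DataSpec, SoftwareModel) are hypothetical and are not taken from OKG-Soft. The slide does not say which input carries which variable, so the four variables are attached to the Weather input purely for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Variable:
    label: str  # human-readable name, as written on the slide
    unit: str   # unit string, as written on the slide


@dataclass
class DataSpec:
    name: str
    variables: List[Variable] = field(default_factory=list)


@dataclass
class SoftwareModel:
    name: str
    inputs: List[DataSpec] = field(default_factory=list)
    outputs: List[DataSpec] = field(default_factory=list)

    def input_variables(self) -> List[str]:
        """Return the label of every variable exposed by the model inputs."""
        return [v.label for spec in self.inputs for v in spec.variables]


# Assumption: the variables are attached to "Weather" for illustration only.
weather = DataSpec("Weather", [
    Variable("Land surface temperature", "degC"),
    Variable("Precipitation rate", "mm/h"),
    Variable("Land surface wind speed", "m/day"),
    Variable("Net radiation", "MJ/(day m^2)"),
])
model = SoftwareModel(
    name="Hydrology Software Model",
    inputs=[weather, DataSpec("DEM"), DataSpec("Infiltration")],
    outputs=[DataSpec("Outflow"), DataSpec("Error")],
)
print(model.input_variables())
```

With such a description, a client can list what a model expects without reading its source code or documentation.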
2. Capturing the functions of the software being used
Hydrology Software Model:
- Function A: Richards equation for water movement (unsaturated soil)
- Function B: Saint-Venant equations (shallow water)
3. Using principled ontologies with structured names for model variables, processes, and methods
[Figure: ad hoc variable names such as Temp, T and T_C correspond to svo:land_surface_air__temperature]
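The idea of resolving ad hoc labels to one structured name can be sketched as a simple lookup. This is a minimal illustration, assuming a hand-written alias table; the table and the helper names (ALIASES, standard_name, same_variable) are hypothetical, and only the labels and the SVO name come from the slide.

```python
from typing import Dict, Optional

# Hypothetical alias table: each model-specific label points to one standard name.
ALIASES: Dict[str, str] = {
    "Temp": "svo:land_surface_air__temperature",
    "T": "svo:land_surface_air__temperature",
    "T_C": "svo:land_surface_air__temperature",
}


def standard_name(label: str) -> Optional[str]:
    """Return the standard name for a label, or None if it is unknown."""
    return ALIASES.get(label)


def same_variable(label_a: str, label_b: str) -> bool:
    """Two labels denote the same variable if both resolve to the same standard name."""
    name_a, name_b = standard_name(label_a), standard_name(label_b)
    return name_a is not None and name_a == name_b


print(same_variable("Temp", "T_C"))
print(same_variable("Temp", "RAIN"))
```

Unknown labels are treated as non-matching rather than guessed, which keeps incorrect links out of any later comparison.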
4. Capturing the semantic structure of software invocations
• Dependencies?
• Sample runs?
• Invocation command?
• Is data supposed to be in the same folder?
• Default arguments/configuration files?
• Volumes?
• Do I have to log in to the image?
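A structured invocation record could answer several of these questions at once. The sketch below is an assumption about what such a record might hold, not the representation used by OKG-Soft: the Invocation class, the command, and the paths are hypothetical. It only assembles a standard `docker run` argument list (`--rm`, `-v host:container`) and does not execute anything.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Invocation:
    image: str                 # container image to run
    command: List[str]         # executable plus default arguments
    volumes: Dict[str, str] = field(default_factory=dict)  # host path -> container path

    def docker_argv(self) -> List[str]:
        """Build the argument list for `docker run` without executing it."""
        argv = ["docker", "run", "--rm"]
        for host_path, container_path in self.volumes.items():
            argv += ["-v", f"{host_path}:{container_path}"]
        return argv + [self.image] + self.command


# Hypothetical command and paths, for illustration only.
invocation = Invocation(
    image="mintproject/pihm2cycles",
    command=["./run.sh", "--config", "/data/config.json"],
    volumes={"/tmp/inputs": "/data"},
)
print(" ".join(invocation.docker_argv()))
```

Keeping the command, default arguments and volume mappings as separate fields means each one can be queried or overridden independently.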
Prior Work: OntoSoft Software Metadata Registry
OntoSoft: Model and Software Metadata Registry (http://ontosoft.org)
• Complements code repositories to make them understandable
• Software metadata designed for scientists
• Metadata is curated by decentralized communities of users
• Training scientists on best practices
[Figure: Finding Software]
[Gil et al. 2015]: OntoSoft: Capturing Scientific Software Metadata. Eighth ACM International Conference on Knowledge Capture, Palisades, NY, 2015.
[Figure: PIHM, PIHMgis, DrEICH, TauDEM, WBMsed]
Evolving OntoSoft: Software Description Ontology
https://w3id.org/okn/o/sd#
Extensions:
• Schema.org (software metadata)
• W3C Data Cubes (Contents of inputs and outputs)
• NASA QUDT (Units)
• DockerPedia (Software images)
• Scientific Variables Ontology (Standard Variables)
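To make the role of the combined vocabularies concrete, here is a minimal JSON-LD sketch built with Python's standard library. The `sd` prefix is bound to the namespace given on the slide (written here as https://w3id.org/okn/o/sd#) and `schema` to Schema.org. The term names `sd:Software`, `sd:hasInput` and `sd:hasOutput` are assumptions and should be checked against the ontology documentation; the `example.org` identifiers are hypothetical.

```python
import json

description = {
    "@context": {
        "sd": "https://w3id.org/okn/o/sd#",
        "schema": "https://schema.org/",
    },
    "@id": "https://example.org/software/hydrology-model",  # hypothetical identifier
    "@type": "sd:Software",                                  # assumed class name
    "schema:name": "Hydrology Software Model",
    "sd:hasInput": [                                         # assumed property name
        {"@id": "https://example.org/spec/weather", "schema:name": "Weather"},
    ],
    "sd:hasOutput": [                                        # assumed property name
        {"@id": "https://example.org/spec/outflow", "schema:name": "Outflow"},
    ],
}

serialized = json.dumps(description, indent=2)
print(serialized)
```

Reusing Schema.org for generic fields such as the name keeps the description readable by tools that know nothing about the more specific terms.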
Evolving OntoSoft: Extending schema.org and Codemeta
Describing Input/Output files, parameters and variables
[Figure: Scientific Variables Ontology identifiers]
M. Stoica and S. D. Peckham, “An Ontology Blueprint for Constructing Qualitative and Quantitative Scientific Variables,” in Proceedings of the ISWC 2018 Posters & Demonstrations, Industry and Blue Sky Ideas Tracks co-located with the 17th International Semantic Web Conference (ISWC 2018), Monterey, USA, October 8th to 12th, 2018.
Machine readable representation of units
[Figure: the variable “RAIN” with unit mm/day is processed by CCUT, linking and augmenting from Wikidata]
B. Shbita, A. Rajendran, J. Pujara, and C. Knoblock, “Parsing, Representing and Transforming Units of Measure,” in Modeling the World's Systems, 2019.
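Once units are machine readable, transformations such as mm/day to mm/h become mechanical. The sketch below is a small hand-written converter for length-per-time rates; it does not use CCUT or its API, and the factor tables and function names are hypothetical. For example, 24 mm/day corresponds to 24 / 24 = 1 mm/h.

```python
from typing import Tuple

# Conversion factors to base units (metres and seconds).
LENGTH_M = {"mm": 1e-3, "m": 1.0}
TIME_S = {"s": 1.0, "h": 3600.0, "day": 86400.0}


def parse_rate(unit: str) -> Tuple[float, float]:
    """Split a 'length/time' unit string into its two base-unit factors."""
    length, time = unit.split("/")
    return LENGTH_M[length], TIME_S[time]


def convert_rate(value: float, src: str, dst: str) -> float:
    """Convert a rate by going through metres per second."""
    src_length, src_time = parse_rate(src)
    dst_length, dst_time = parse_rate(dst)
    metres_per_second = value * src_length / src_time
    return metres_per_second * dst_time / dst_length


print(convert_rate(24.0, "mm/day", "mm/h"))
```

Unknown unit strings raise a KeyError instead of being silently passed through, so a mismatch is caught before data reaches a model.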
Evolving OntoSoft: Describing Containers
[Figure: the image tag “mintproject/pihm2cycles” is passed to DockerPedia, which provides a semantic representation of the image]
https://dockerpedia.inf.utfsm.cl/
M. Osorio, H. Vargas, and C. Buil Aranda, “DockerPedia: a Knowledge Graph of Docker Images,” in Proceedings of the ISWC 2018 Posters & Demonstrations, Industry and Blue Sky Ideas Tracks co-located with the 17th International Semantic Web Conference (ISWC 2018), Monterey, 2018.
OKG-SOFT: Framework
Software Model Catalog contains:
• Models from hydrology, agriculture and economy, with their versions and model configurations
• More than 200 variables mapped to SVO
• All models are executable through scientific workflows
• Most contents are added manually and collaboratively by expert users
• Automated unit transformations
• Automated software image description
• Semi-automated Wikidata linking
APIs (https://query.mint.isi.edu/api/mintproject/MINT-ModelCatalogQueries#/):
• SPARQL endpoint
• REST APIs (GET/POST)
• Python clients
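As an illustration of the SPARQL access path, the sketch below prepares a query URL following the SPARQL 1.1 Protocol, which accepts a query over HTTP GET in a `query` parameter. The endpoint address is a placeholder, the `sd:Software` class name is an assumption, and no request is actually sent; the project's Python clients are not used here.

```python
from urllib.parse import urlencode

ENDPOINT = "https://example.org/sparql"  # placeholder, not the project's endpoint

# Assumption: software entries are typed sd:Software and carry an rdfs:label.
QUERY = """
PREFIX sd: <https://w3id.org/okn/o/sd#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?software ?label WHERE {
  ?software a sd:Software ;
            rdfs:label ?label .
} LIMIT 10
"""


def build_request_url(endpoint: str, query: str) -> str:
    """Encode the query as the `query` parameter of a GET request."""
    return endpoint + "?" + urlencode({"query": query.strip()})


url = build_request_url(ENDPOINT, QUERY)
print(url)
```

A client would typically send this URL with an `Accept: application/sparql-results+json` header and read the variable bindings from the JSON response.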
Exploitation: Exploring Scientific Software Model Metadata
http://models.mint.isi.edu
• Explore variables
• Explore Software I/O
• Find Software Models
Exploitation: Comparing Scientific Software Models
Exploitation: Towards Automated Software Composition
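A first step towards composition is finding which outputs of one software can feed the inputs of another. The sketch below matches them by shared standard variable names. It is only an illustration of the idea, not the method used in OKG-Soft; the function name and the `ex:` variable identifiers are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def compatible_links(
    outputs_by_software: Dict[str, Set[str]],
    inputs_by_software: Dict[str, Set[str]],
) -> List[Tuple[str, str, str]]:
    """Return (producer, variable, consumer) for every shared standard variable."""
    links = []
    for producer, outputs in sorted(outputs_by_software.items()):
        for consumer, inputs in sorted(inputs_by_software.items()):
            if producer == consumer:
                continue  # a software is not linked to itself
            for variable in sorted(outputs & inputs):
                links.append((producer, variable, consumer))
    return links


# Hypothetical standard variable identifiers, for illustration only.
outputs_by_software = {"Software A": {"ex:outflow", "ex:error"}}
inputs_by_software = {"Software B": {"ex:outflow", "ex:elevation"}}
print(compatible_links(outputs_by_software, inputs_by_software))
```

Matching on standard names rather than file labels is what makes this possible: two models that call the same quantity by different names would otherwise never be linked.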
Summary
Scientific software reusability is crucial to understand:
• Existing data
• Published methods
1. Requirements for scientific software reusability include:
• Exposing inputs, outputs, variables and software invocation details
2. Our approach for capturing and structuring scientific software metadata
3. A framework to query, explore, exploit and publish software metadata
Help us make your software more reusable
We would like to thank Scott Peckham, Maria Stoica, Chris Duffy, Lele Shu, Kelly Cobourn, Zeya Zhang, Suzanne Pierce, Armen Kemanian, Rajiv Mayani, Jay Pujara, Basel Shbita, Dhruv Pattel, Rohit Mayura, Amrish Goel and Anuj Doiphode.
Contact me: dgarijo@isi.edu
