Daniel Jacob – INRA - 2018
How to ensure that open data
works for research
Make your data great again
Daniel Jacob
INRA UMR 1332 BFP – Metabolism Group
Bordeaux Metabolomics Facility
Oct 2018
Follow-up to: https://fr.slideshare.net/danieljacob771282/make-your-data-great-now
Give open access to your data
and make it ready to be mined
Open Data for Access and Mining
ODAM Framework
Develop, if needed, lightweight tools
- R scripts, lightweight GUI (R Shiny), …
Minimal effort, maximal efficiency
Use existing tools
- Spreadsheets, RStudio, BioStatFlow, Galaxy, Cytoscape, …
Data format: TSV
EDTMS / ODAM
FAIR: Findable, Accessible, INTEROPERABLE, Reusable
Experiment Data Tables + 2 metadata files
Research question → Project → Experiment → Experimental set-up
→ Data emancipation from tools
Data API → Tools
Data / Tools
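Because the data tables are plain TSV, any generic tool can read them. A minimal sketch in Python, assuming a hypothetical data table with the columns `SampleID` and `FruitWeight` (neither the column names nor the values come from the slides):

```python
import csv
import io

# Hypothetical content of one data table (tab-separated values).
SAMPLES_TSV = "SampleID\tFruitWeight\nS1\t12.5\nS2\t14.1\n"


def read_tsv(text):
    """Parse TSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


rows = read_tsv(SAMPLES_TSV)
weights = [float(row["FruitWeight"]) for row in rows]
mean_weight = sum(weights) / len(weights)
print(rows[0]["SampleID"], mean_weight)
```

Only the standard library is used here, which is the point of the format choice: no dedicated tool is needed to get at the values.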
Multi-species
Data Integration
Data integration
Towards Linked Data
Phenotype Information System
« Plant Physiology and Metabolism»
https://www.quora.com/What-is-plant-physiology-and-metabolism
« Plant Growth»
Resource Description Framework (RDF)
http://cgi.di.uoa.gr/~pms509/past_lectures/introduction-to-rdf.pdf
Creation of the metadata files - Subsets
s_subsets.tsv: this metadata file associates a key concept with each data subset file (Plants, Harvests, Samples, Compounds, …).
Optional: an ontology-based annotation (CV Term)
a_attributes.tsv: this metadata file annotates each attribute (variable) with minimal but relevant metadata.
Optional: an ontology-based annotation (CV Term)
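The role of the two metadata files can be sketched as follows. The column names used here (`subset`, `identifier`, `file`, `attribute`, `category`, `cv_term`) are simplified assumptions for illustration; the real ODAM layout may differ. The `cv_term` column is left empty because the ontology annotation is optional.

```python
import csv
import io

# Hypothetical, simplified metadata files; the real column layout may differ.
S_SUBSETS = (
    "subset\tidentifier\tfile\tcv_term\n"
    "plants\tPlanteID\tplants.tsv\t\n"
    "samples\tSampleID\tsamples.tsv\t\n"
)
A_ATTRIBUTES = (
    "subset\tattribute\tcategory\tcv_term\n"
    "plants\tPlanteID\tidentifier\t\n"
    "plants\tTreatment\tfactor\t\n"
    "samples\tSampleID\tidentifier\t\n"
    "samples\tFruitWeight\tquantitative\t\n"
)


def load(text):
    """Parse TSV text into a list of row dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def attributes_by_subset(subsets, attributes):
    """Group (attribute, category) pairs under each declared data subset."""
    grouped = {row["subset"]: [] for row in subsets}
    for row in attributes:
        grouped[row["subset"]].append((row["attribute"], row["category"]))
    return grouped


catalog = attributes_by_subset(load(S_SUBSETS), load(A_ATTRIBUTES))
print(catalog["samples"])
```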
Data / Metadata
Entities: subsets, subsets CV Term (s_subsets.tsv)
Attributes: categories, attributes CV Term (a_attributes.tsv)
CV Term ?
Entity + Attribute = Trait
Trait (characteristic / feature)
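The relation Entity + Attribute = Trait can be written down as a small data structure. This is only an illustrative sketch; the class and field names are assumptions, and the optional CV fields stand for the ontology terms that may annotate each part.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Trait:
    entity: str                         # what is observed, e.g. "fruit"
    attribute: str                      # which characteristic, e.g. "weight"
    entity_cv: Optional[str] = None     # optional ontology term for the entity
    attribute_cv: Optional[str] = None  # optional ontology term for the attribute

    @property
    def name(self) -> str:
        """Human-readable trait name built from entity and attribute."""
        return f"{self.entity} {self.attribute}"


fruit_weight = Trait("fruit", "weight")
print(fruit_weight.name)
```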
Plant Trait Ontology as the core / kernel of all ontologies
Entity + Attribute = Trait
Trait (characteristic / feature)
Ontologies:
- TO: Plant Trait Ontology
- EO: Plant Environment Ontology
- PO: Plant Structure & Development Stage Ontology
- CHEBI Ontology
- GO Ontology
- …
http://agroportal.lirmm.fr/ontologies
« Plant Physiology and Metabolism »
« Plant Growth »
TBox: Reference ontologies
A TBox is a "terminological component": a conceptualization associated with a set of facts.
Entities (data subsets and their identifiers):
- Plants: plants.tsv (PlanteID)
- Harvests: harvests.tsv (Lot)
- Samples: samples.tsv (SampleID)
- Compounds: compounds.tsv (SampleID)
- Enzymes: enzymes.tsv (SampleID)
Attributes (categories): factor, quantitative, qualitative, identifier
CV Terms from reference ontologies (TO, EO, PO, GO, CHEBI, …): http://agroportal.lirmm.fr/ontologies
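Since the data subsets share identifier columns such as `SampleID`, they can be combined with a plain join. A minimal sketch, assuming hypothetical file contents (the `Glucose` column and all values are invented for illustration):

```python
import csv
import io

# Hypothetical excerpts of two data subsets that share the SampleID identifier.
SAMPLES = "SampleID\tPlanteID\tLot\nS1\tP1\tL1\nS2\tP2\tL1\n"
COMPOUNDS = "SampleID\tGlucose\nS1\t3.2\nS2\t2.7\n"


def read_tsv(text):
    """Parse TSV text into a list of row dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def join_on(left, right, key):
    """Inner-join two lists of row dicts on a shared identifier column."""
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]


merged = join_on(read_tsv(SAMPLES), read_tsv(COMPOUNDS), "SampleID")
print(merged[0])
```

Rows without a match in the second table are dropped, which is the usual inner-join behaviour; a left join would be an equally reasonable choice.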
ABox: Application ontologies
An ABox is an "assertion component": a fact associated with a conceptual model or ontologies within a knowledge base.
Data / Metadata: Entities, Attributes, Category, Species, Subset, CV Term
Entity + Attribute = Trait
Typical queries:
Search for a particular Trait
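The TBox / ABox split can be illustrated with plain tuples: the TBox lists the vocabulary, and the ABox holds facts that may only use that vocabulary. The property names follow the diagram labels (`hasEntity`, `hasAttribute`, `hasCategory`, `hasSpecies`); the facts themselves are hypothetical.

```python
# TBox: the properties that statements are allowed to use.
TBOX_PROPERTIES = {"hasEntity", "hasAttribute", "hasCategory", "hasSpecies"}

# ABox: hypothetical facts about one dataset, as (subject, property, object).
ABOX = [
    ("FruitWeight", "hasEntity", "fruit"),
    ("FruitWeight", "hasAttribute", "weight"),
    ("FruitWeight", "hasCategory", "quantitative"),
    ("FruitWeight", "hasSpecies", "Tomato"),
]


def non_compliant(abox, tbox_properties):
    """Return the facts whose property is not declared in the TBox."""
    return [fact for fact in abox if fact[1] not in tbox_properties]


print(non_compliant(ABOX, TBOX_PROPERTIES))
```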
[Figure: RDF Schema, for each dataset]
Nodes: attribute node (rdf:type Attributes), subset node (rdf:type Subsets), rdf:Bag
Properties: rdfs:label <description>, #description, #hasEntity, #hasAttribute, #hasCategory, #hasSpecies, rdfs:range
Values: xsd:string, or rdf:resource (CV Term)
Categories: factor, quantitative, qualitative, identifier
ABox - Application ontologies
TBox - Reference ontologies: TO, EO, PO, CHEBI, GO, Taxon, …
https://schema.org/Dataset: measurementTechnique, variableMeasured
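How one attribute node might be serialized can be sketched as N-Triples lines. The namespace `http://example.org/odam#` is a placeholder, not an ODAM URL, and the objects are written as plain string literals for brevity although the schema also allows links to CV terms.

```python
# Hypothetical namespace for the dataset-specific terms; not an ODAM URL.
BASE = "http://example.org/odam#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def iri(local_name):
    """Wrap a local name of the hypothetical namespace as an N-Triples IRI."""
    return f"<{BASE}{local_name}>"


def literal(text):
    """Quote a plain string literal, escaping backslashes and double quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def attribute_triples(name, label, entity, attribute, category, species):
    """Describe one attribute node as a list of N-Triples lines."""
    node = iri(name)
    return [
        f"{node} <{RDF_TYPE}> {iri('Attributes')} .",
        f"{node} <{RDFS_LABEL}> {literal(label)} .",
        f"{node} {iri('hasEntity')} {literal(entity)} .",
        f"{node} {iri('hasAttribute')} {literal(attribute)} .",
        f"{node} {iri('hasCategory')} {literal(category)} .",
        f"{node} {iri('hasSpecies')} {literal(species)} .",
    ]


lines = attribute_triples(
    "FruitWeight", "Fruit fresh weight", "fruit", "weight", "quantitative", "Tomato"
)
print("\n".join(lines))
```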
Towards a Phenotype Information System
Automatic population of the knowledge base from the metadata files defined within ODAM data subsets
Data / Metadata: Entities, Attributes, Category, CV Term
Traits, Values
Phenotype (observed) = Traits + Values
Towards a Phenotype Information System
Typical queries:
Search for a particular Trait, with or without Constraints
Trait: Fruit + weight = Fruit weight
and
Constraint: Species = Tomato (hasSynonym Tomato)
Entities, Attributes
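A typical query of this kind can be sketched over a small in-memory set of facts. The dataset prefixes, attribute names and the synonym table below are hypothetical; the synonym lookup stands in for the `hasSynonym` link, which lets "Tomato" match facts recorded under the scientific name.

```python
# Hypothetical facts from two datasets, as (subject, property, object).
FACTS = [
    ("ds1:FruitWeight", "hasEntity", "fruit"),
    ("ds1:FruitWeight", "hasAttribute", "weight"),
    ("ds1:FruitWeight", "hasSpecies", "Solanum lycopersicum"),
    ("ds2:BerryMass", "hasEntity", "fruit"),
    ("ds2:BerryMass", "hasAttribute", "weight"),
    ("ds2:BerryMass", "hasSpecies", "Vitis vinifera"),
    ("ds1:LeafArea", "hasEntity", "leaf"),
    ("ds1:LeafArea", "hasAttribute", "area"),
    ("ds1:LeafArea", "hasSpecies", "Solanum lycopersicum"),
]

# Hypothetical synonym table, keyed by lower-cased common name.
SYNONYMS = {"tomato": {"tomato", "solanum lycopersicum"}}


def subjects(facts, prop, accepted):
    """Subjects having `prop` with an object in the accepted (lower-case) set."""
    return {s for s, p, o in facts if p == prop and o.lower() in accepted}


def find_trait(facts, entity, attribute, species=None):
    """Find attribute nodes for Entity + Attribute, optionally constrained by species."""
    found = subjects(facts, "hasEntity", {entity.lower()})
    found &= subjects(facts, "hasAttribute", {attribute.lower()})
    if species is not None:
        names = SYNONYMS.get(species.lower(), {species.lower()})
        found &= subjects(facts, "hasSpecies", names)
    return sorted(found)


print(find_trait(FACTS, "fruit", "weight"))
print(find_trait(FACTS, "fruit", "weight", species="Tomato"))
```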
Phenotype (observed) = (Entity + Attribute) + Values
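The step from traits to observed phenotypes can be sketched by attaching measured values to a trait definition. The mapping from data columns to traits is assumed to come from the attribute metadata; the column names and values are hypothetical.

```python
from typing import NamedTuple


class Trait(NamedTuple):
    entity: str
    attribute: str


class Phenotype(NamedTuple):
    sample_id: str
    trait: Trait
    value: float


# Hypothetical mapping from data columns to traits.
COLUMN_TRAITS = {"FruitWeight": Trait("fruit", "weight")}

# Hypothetical data rows, as read from a TSV data subset.
ROWS = [
    {"SampleID": "S1", "FruitWeight": "12.5"},
    {"SampleID": "S2", "FruitWeight": "14.1"},
    {"SampleID": "S3", "FruitWeight": ""},  # missing value, skipped below
]


def phenotypes(rows, column_traits, id_column="SampleID"):
    """Build one Phenotype per non-empty value of each annotated column."""
    return [
        Phenotype(row[id_column], trait, float(row[column]))
        for row in rows
        for column, trait in column_traits.items()
        if row.get(column) not in (None, "")
    ]


observed = phenotypes(ROWS, COLUMN_TRAITS)
print(observed[0])
```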
Data capture: Values
Data mapping: Category, CV Term, Entities, Attributes
Data linking: attribute node, subset node, #hasEntity, #hasAttribute, #hasCategory, #hasSpecies
Develop, if needed, lightweight tools
- R scripts (Galaxy), lightweight GUI (R Shiny)
Towards a Phenotype Information System
Data Exploration
Phenotype (observed) = Traits + Values
Data = Phenotypic data + Molecular data + Environment data
Phenotypic metadata = Descriptors of Traits (Entity-Attribute) + Environment Factors
Data accumulation → Knowledge Base
Model-Based Bayesian Inference:
Bayes' theorem, the general formula:
y: data, θ: parameters
$[y, \theta] = [y \mid \theta] \cdot [\theta] = [\theta \mid y] \cdot [y]$
where [.] means a density or a probability
- $[\theta \mid y]$: posterior density, or simply the "posterior"
- $[\theta]$: prior density of θ, or simply the "prior"
- $[y \mid \theta]$: likelihood (function of θ)
- $[y]$: marginal density (data, model)
Ex: model for phenotypic variance and biomass prediction (Y) based on environmental parameters (θ)
Data mining
Phenotype Information System
Machine Learning
« Plant Growth »
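Bayes' formula can be made concrete with a grid approximation. This sketch is not the biomass model mentioned on the slide: it assumes a much simpler model (observations normally distributed around a single parameter θ with known standard deviation), a flat prior, and invented data.

```python
import math


def grid_posterior(data, grid, prior, sigma=1.0):
    """Posterior [theta | y] on a grid: likelihood * prior / marginal."""

    def likelihood(theta):
        # [y | theta]: product of normal densities centred on theta.
        return math.prod(
            math.exp(-0.5 * ((y - theta) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))
            for y in data
        )

    joint = [likelihood(t) * p for t, p in zip(grid, prior)]  # [y | theta].[theta]
    marginal = sum(joint)                                      # [y]
    return [j / marginal for j in joint]                       # [theta | y]


grid = [i / 10 for i in range(0, 101)]  # candidate values of theta: 0.0 to 10.0
prior = [1 / len(grid)] * len(grid)     # flat prior over the grid
data = [4.8, 5.1, 5.3]                  # hypothetical observations
posterior = grid_posterior(data, grid, prior)
best = grid[posterior.index(max(posterior))]
print(best)
```

With a flat prior the posterior peaks at the grid value closest to the sample mean; a more informative prior would pull that peak towards prior knowledge.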
Make your data great again
Making open data work for research
→ Metadata: not just on the "top", linked to datasets, but more deeply linked to the variables.
The data management system becomes completely independent of data usage.
One dataset → Several applications
&
One application → Several datasets
Data accumulation → Knowledge Base
→ Keep data "alive" in the data processing loop
→ in a similar way to DNA/protein sequences, which can be integrated into annotation pipelines.
Machine Learning
Model-Based Bayesian Inference
Make your data great again - Ver 2


Editor's Notes

  • #9: Trait vs Phenotype: Entity + Attribute = Trait (observable); Entity + (Attribute + Value) = Phenotype (observed)
  • #10: an ABox is an "assertion component"—a fact associated with a terminological vocabulary within a knowledge base
  • #11: TBox statements describe a system in terms of controlled vocabularies, for example, a set of classes and properties. ABox are TBox-compliant statements about that vocabulary.
  • #12: Question types: What is the set of "Traits" (quantitative/qualitative) for a given sample (identifier)? What is the set of "Traits" (quantitative/qualitative) for one or more given CVs { subset type, e.g. CV subset in (metabolite, enzyme) (CHEBI); attribute type, e.g. CV attribute == tissue == "fruit pericarp" (PO) }, with or without an additional constraint, e.g. factor type?