The Past, Present and Future of
Knowledge in Biology
Robert Stevens
BioHealth Informatics Group
The University of Manchester
Manchester
United Kingdom
Robert.Stevens@manchester.ac.uk
Overview
• A look at the state of play
• For what are we using ontologies?
• What do we count as knowledge?
• Doing so much more with knowledge
• Stopping text being a dead end
Text and Ontologies: The Terrible
Twins of Knowledge in Biology
Biology now has lots of facts
Genome
Proteome
Transcriptome
Interactome
Metabolome
PHENOME
Lots of catalogues
Data are only as Good as their
Metadata
• There is a lot of biology out there…
• How these entities are described in our data varies
• We don’t even agree on what entities there are to
describe in our data
• This makes analysing data hard: You have to know
what your data represent
• …, but also how the entities described in your data
relate to each other
• We need to describe our data – their metadata
Creating Woods, not Trees
Genes
Proteins
Pathways
Interactions
Literature
Complex
Machines
Virtual
Organism
…. from biological facts, we make a system that is some model of a real organism
Timeline
There’s a Lot of it About
Searching for “ontology” in five-year chunks on the ACM digital portal
Searching for “ontology” in five-year chunks on PubMed
It’s all Gruber’s Fault
• “In the context of knowledge sharing, the term ontology means a
specification of a conceptualisation. That is, an ontology is a description
(like a formal specification of a program) of the concepts and relationships
that can exist for an agent or a community of agents. This definition is
consistent with the usage of ontology as set-of-concept-definitions, but
more general. And it is certainly a different sense of the word than its use
in philosophy.” DOI:10.1006/knac.1993.1008 DOI:10.1006/ijhc.1995.1081
Angels on the head of a pin
Everything with a Blob and Line is
called an Ontology
• Wide acceptance criteria
• Narrow evaluation criteria
• Different sort of knowledge for different
situations
• Different styles of representation; some
scruffy and some formal
• Representing knowledge in biology is more
than ontologies
• We could stop calling them ontologies
RDF graph
Database schema
Thesaurus
OWL ontology
Formal ontology
SKOS vocabulary
Uses of Ontologies
Knowing What We’ve got is so
Useful
• We could computationally handle lots of data,
but we couldn’t do so with what we know
about those data
• Ontologies so far mainly used for a common
tongue so that we can compare
• … and it works!
• Still getting lots of mileage from ontology
annotation
• …, But there is so much more
GENERIC GENE ONTOLOGY (GO) TERM FINDER
S000003093
MXR1
YPL250C
S000004294
SAM3
YIR017C
S000003152
MMP1
MET1
Expressed
Genes
P-value score
http://go.princeton.edu/cgi-bin/GOTermFinder
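A P-value score of this kind is commonly obtained from a hypergeometric test: given how many genes in the background carry a GO term, how surprising is it to see this many of them in the expressed-gene list? The sketch below illustrates that general technique only; it is an assumption, not a description of how the GO Term Finder itself is implemented, and the function name and parameters are hypothetical.

```python
from math import comb


def enrichment_p_value(population: int, annotated: int, sample: int, hits: int) -> float:
    """Probability of seeing at least `hits` annotated genes in a random sample.

    population: number of genes in the background set
    annotated:  background genes annotated to the GO term
    sample:     number of genes in the expressed-gene list
    hits:       expressed genes annotated to the GO term
    """
    total = comb(population, sample)          # all possible gene lists of this size
    upper = min(annotated, sample)            # most hits that are possible at all
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Toy counts, invented purely for illustration.
print(enrichment_p_value(population=6000, annotated=40, sample=9, hits=4))
```

In practice a tool would also correct for testing many GO terms at once; that step is left out here.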
Classifying a Mouse
Individual Description:
Stops wriggling after 3 sec
Has 3 cm tail
Mass 10g
10 days old (since birth)
Strain C57Bl/6
Class Description:
Class: DepressedMouse
EquivalentTo: Mouse that
(wriggles for <= 30 or swims for <= 45)
DataTransformation
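The class description above can be read as a test applied to each individual's measurements. The following is a minimal sketch of that reading in plain Python, assuming the thresholds are in seconds (the slide gives no unit for them) and using hypothetical field names. An OWL reasoner works under the open-world assumption and would do this from the axioms themselves; this closed-world check only mimics the outcome.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MouseObservation:
    # Hypothetical field names; None means "not measured".
    wriggle_seconds: Optional[float] = None  # time until the mouse stops wriggling
    swim_seconds: Optional[float] = None     # time until the mouse stops swimming


def is_depressed_mouse(obs: MouseObservation) -> bool:
    """Mirror of the class description: wriggles for <= 30 OR swims for <= 45."""
    wriggles_briefly = obs.wriggle_seconds is not None and obs.wriggle_seconds <= 30
    swims_briefly = obs.swim_seconds is not None and obs.swim_seconds <= 45
    return wriggles_briefly or swims_briefly


# The individual from the slide: stops wriggling after 3 sec.
individual = MouseObservation(wriggle_seconds=3)
print(is_depressed_mouse(individual))
```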
Short tailed mouse
Class: ShortTailedMouse
EquivalentTo: Mouse that
hasPart exactly 1 (Tail that hasAssay some
(LengthAssay that hasValue some int[<= 20] and hasUnit
some Millimetre))
SubClassOf: Mouse that
hasPart some (Tail that hasQuality some Short)
• We can recognise an instance of short-tailed mouse, but we
also know that it has the quality “short”
• Even when the fact isn’t asserted
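The two halves of the definition do different jobs: the EquivalentTo part recognises members from the length assay, and the SubClassOf part then adds the quality “short” to every recognised member. A rough sketch of that two-step inference, with hypothetical attribute names and none of the generality of a real reasoner, might look like this:

```python
from dataclasses import dataclass, field


@dataclass
class Mouse:
    tail_length_mm: int                       # result of the length assay, in millimetres
    inferred_classes: set = field(default_factory=set)
    inferred_tail_qualities: set = field(default_factory=set)


def classify_tail(mouse: Mouse) -> Mouse:
    # EquivalentTo: the length assay alone is enough to recognise the class.
    if mouse.tail_length_mm <= 20:
        mouse.inferred_classes.add("ShortTailedMouse")
    # SubClassOf: every recognised member also has the quality, although
    # nobody asserted it for this individual.
    if "ShortTailedMouse" in mouse.inferred_classes:
        mouse.inferred_tail_qualities.add("Short")
    return mouse


print(classify_tail(Mouse(tail_length_mm=15)))
```

A 3 cm tail is 30 mm, so an individual with that measurement would not be recognised as short-tailed under this definition.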
Classifying Proteins
>uniprot|Q15262|PTPK_HUMAN Receptor-type protein-tyrosine phosphatase kappa precursor
(EC 3.1.3.48) (R-PTP-kappa).
MDTTAAAALPAFVALLLLSPWPLLGSAQGQFSAGGCTFDDGPGACDYHQDLYDDFEWVHV
SAQEPHYLPPEMPQGSYMIVDSSDHDPGEKARLQLPTMKENDTHCIDFSYLLYSQKGLNP
GTLNILVRVNKGPLANPIWNVTGFTGRDWLRAELAVSSFWPNEYQVIFEAEVSGGRSGYI
AIDDIQVLSYPCDKSPHFLRLGDVEVNAGQNATFQCIATGRDAVHNKLWLQRRNGEDIPV………..
InterPro
Instance Store
Reasoner
Translate
Codify
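The pipeline on this slide (sequence, InterPro, translate, instance store, reasoner) amounts to describing each protein by its domain composition and matching that description against codified class definitions. The sketch below is a simplified stand-in for that idea: the class names and domain names are hypothetical placeholders, and exact matching on domain counts is an assumption used here in place of real OWL reasoning.

```python
from collections import Counter

# Hypothetical class definitions: each maps a class name to the exact domain
# composition that defines it. Real definitions would be codified in OWL.
CLASS_DEFINITIONS = {
    "ExampleReceptorClass": Counter({"catalytic_domain": 2, "transmembrane_region": 1}),
    "ExampleSolubleClass": Counter({"catalytic_domain": 1}),
}


def classify_protein(domain_hits):
    """Return the classes whose defining composition matches the domain hits.

    `domain_hits` is the list of domain names reported for one sequence,
    with repeats when a domain occurs more than once.
    """
    composition = Counter(domain_hits)
    return sorted(
        name for name, required in CLASS_DEFINITIONS.items() if composition == required
    )


hits = ["catalytic_domain", "transmembrane_region", "catalytic_domain"]
print(classify_protein(hits))
```

Counting occurrences matters because a class may be defined by having exactly two copies of a domain rather than merely some.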
OWL’s Automated Reasoners
• Demonstrably useful in:
– Building ontologies
– Querying ontologies
– Can automatically annotate
– Have made “discoveries”
But there is more than OWL’s reasoning
Separation of Knowledge and
Software
• We realised a long time ago that we needed
to separate knowledge from software
• We only recently called this knowledge
component ontology
• We don’t really need to see the ontology
• We certainly shouldn’t show people OWL; it
“scares the horses”
• Ontology for software not humans (L. Hunter)
The Ontology Cottage Industry
• We’ve industrialised data production
• We’ve (to some extent) industrialised data
analysis
• We’ve not really moved away from
hand-crafted, “whittled” ontologies
Can we have Mass Editing of
Ontologies?
• Probably not;
• Computer scientists in love with synchronous
editing
• …, but not really necessary (see CSCW)
• Mass gathering of Knowledge
Mass Gathering of Knowledge and the
Application of Patterns or a
metamodel
http://rightfield.org.uk http://www.e-lico.eu/populous
There’s so much more to Ontology
Building than editing Axioms
• Gathering knowledge
• Adding labels
• Adding other human orientated content
• Reviewing, checking, suggesting
• Deploying, using, creating “views”
• Ontology comprehension
There’s More to KR than OWL
• OWL and its automated reasoners are useful
• But there is so much more to KR than
ontologies and OWL
• Higher order reasoning
• Rules
• Other sorts of reasoning
Generating natural language
Class: HeLa
SubClassOf: Cell,
bearer_of some 'cervical carcinoma',
derives_from some 'Homo sapiens',
derives_from some cervix,
derives_from some 'epithelial cell'
OWL
HeLa is a cell line. A hela is all of the
following: something that is bearer of
a cervical carcinoma, something that
derives from a homo sapiens,
something that derives from an
epithelial cell, and something that
derives from a cervix.
Generated natural language
Experimental Factor Ontology (EFO)
http://www.ebi.ac.uk/efo
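To give a feel for how a class description can be turned into the kind of paragraph shown above, here is a deliberately naive template-based verbaliser. It is a sketch under stated assumptions, not the method used to produce the slide's text: the function name, the phrase table and the article rule are all hypothetical, and the restrictions are passed in as simple (property, filler) pairs rather than parsed from OWL.

```python
# Hypothetical mapping from property names to English verb phrases.
PROPERTY_PHRASES = {
    "bearer_of": "is bearer of",
    "derives_from": "derives from",
}


def article(noun: str) -> str:
    """Pick 'a' or 'an' from the first letter only (a crude rule)."""
    return "an" if noun[0].lower() in "aeiou" else "a"


def verbalise(class_name: str, parent_label: str, restrictions) -> str:
    """Render existential restrictions as one English paragraph."""
    first = f"{class_name} is {article(parent_label)} {parent_label}."
    clauses = [
        f"something that {PROPERTY_PHRASES[prop]} {article(filler)} {filler}"
        for prop, filler in restrictions
    ]
    if not clauses:
        return first
    if len(clauses) == 1:
        body = clauses[0]
    else:
        body = ", ".join(clauses[:-1]) + ", and " + clauses[-1]
    subject = f"{article(class_name).capitalize()} {class_name.lower()}"
    return f"{first} {subject} is all of the following: {body}."


print(verbalise(
    "HeLa",
    "cell line",
    [
        ("bearer_of", "cervical carcinoma"),
        ("derives_from", "homo sapiens"),
        ("derives_from", "epithelial cell"),
        ("derives_from", "cervix"),
    ],
))
```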
Ontology as book
Title: Experimental Factor Ontology
Table of Contents
Chapter 1. Cell line
Chapter 2. Cell type
Chapter 3. Chemical Compound
Chapter 4. Organism
HeLa is a cell line. A hela is all of the
following: something that is bearer of
a cervical carcinoma, something
that derives from a homo sapiens,
something that derives from an
epithelial cell, and something that
derives from a cervix.
entry
Types of Knowledge
Data
Biologist’s head
Papers
Databases
Ontologies
???
It’s not Just “Things”
• Experiments produce data about things
• Proteins, genes, chemicals, reactions,
diseases, size, shape, speed, ….
• As well as this knowledge we have knowledge
of how it was done
• OBI is still the “things” to do with production
• We still need the methods by which these
“things” were deployed
• The protocol
Knowledge about an
experiment
Workflow Run
Workflow
Provenance
Organisational
Results and Interpretation
Workflows are knowledge about
methods
Get genes in region
Get pathways that
contain genes
Merge data into single files
Get gene descriptions
Get pathway descriptions
Cross-reference ids
Methods:
1. A QTL (region of chromosome) is entered into the
workflow, specified as base pairs. These base pairs
are subsequently used to identify, in the Ensembl
database, any genes that lie within this region.
2. Any genes found within this region are subsequently
annotated with Entrez and UniProt identifiers.
3. The Entrez and UniProt identifiers are then passed
to a KEGG id conversion Web Service, to cross-
reference the input ids to KEGG gene identifiers.
This enables gene descriptions and biological
pathway data to be returned from KEGG.
4. Each KEGG gene id is then used in a search for
KEGG pathways. Any pathways found to contain the
gene are returned as KEGG pathway ids.
5. Both KEGG gene and pathway ids are then sent to
individual services, provided by KEGG, which
provide a description of the gene and pathway.
6. The outputs of the workflow are then combined into
single flat files, which can be saved locally and used
to identify novel pathways and genes within the QTL
region.
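The numbered methods above read almost like a program, which is the point of the slide. As a rough illustration, the sketch below chains steps 1 to 4 and 6 over small in-memory tables. Everything in it is an assumption made for illustration: the tables stand in for the Ensembl and KEGG services, the identifiers are invented placeholders rather than real database entries, and the description lookups of step 5 are omitted.

```python
# Toy stand-ins for the remote services named in the methods text.
# All identifiers below are invented placeholders, not real database entries.
GENE_POSITIONS = {"gene_a": 1_500, "gene_b": 4_200, "gene_c": 9_900}
KEGG_ID_FOR_GENE = {"gene_a": "kegg:gene_a", "gene_b": "kegg:gene_b"}
PATHWAYS_FOR_KEGG_GENE = {
    "kegg:gene_a": ["path:one"],
    "kegg:gene_b": ["path:one", "path:two"],
}


def genes_in_region(start: int, end: int):
    """Step 1: find genes whose position lies within the base-pair region."""
    return sorted(g for g, pos in GENE_POSITIONS.items() if start <= pos <= end)


def run_workflow(start: int, end: int):
    """Chain the steps and merge the results into one flat list of rows."""
    rows = []
    for gene in genes_in_region(start, end):
        kegg_gene = KEGG_ID_FOR_GENE.get(gene)      # steps 2-3: cross-reference ids
        if kegg_gene is None:
            continue                                # no cross-reference available
        for pathway in PATHWAYS_FOR_KEGG_GENE.get(kegg_gene, []):   # step 4
            rows.append((gene, kegg_gene, pathway)) # step 6: merge into one table
    return rows


for row in run_workflow(1_000, 5_000):
    print("\t".join(row))
```

Because each step is an explicit function call, the same structure could in principle be walked to produce the methods text rather than writing it by hand.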
myExperiment
http://www.myexperiment.org
Research Objects
Method
Data
Introduction
Conclusions
Results
Human Written
Workflow
Generated Text
Semantically
annotated
Model, View, Controller
Annotated Data
Controller
Projection
Text
Tables Graphs
Steve Pettifer
http://utopia.cs.man.ac.uk/
What Next?
• Ontologies are not the only fruit
• We could stop calling them ontologies
• We need to produce “ontologies” faster
• We need to do more interesting things with our knowledge
• We need to make them pervade our tools
• We need them to be “agile”
• Open to other forms of KR and other forms of reasoning
• Adding to data automatically
• Generating our descriptions of data
Acknowledgements
• Simon Jupp for the slides
• Alan Rector and Carole Goble
• SysMO-DB for RightField (Katy Wolstencroft, Stuart Owen, Matt
Horridge)
• Populous – Simon Jupp
• SWAT – Richard Power, Sandra Williams and Allan Third at the
OU
• EFO – James Malone and Helen Parkinson
• Steve Pettifer for the Utopia and MVC
• Paul Fisher and the Taverna team
• The myExperiment team at Southampton and Manchester



Editor's Notes

  • #5: Slide Title: Literature Lots of books in a library
  • #6: Slide Title: Catalogues Stack of books listing: Genome Transcriptome Proteome Interactome Metabolome Phenome
  • #8: Slide Title. Slide contains: a book on the left with a plus sign; a black-and-white image of a man sat at an old valve-style computer (i.e. the Manchester Baby); text saying “genes, proteins, interactions, pathways”; a mouse on the right. Text below the images says: (left) Literature, (middle) Complex machines, (right) Organism, (bottom) “…. from biological facts, we make a system that is some model of a real thing” (Robert Stevens, 2008)
  • #21: All of which helps build better ontologies. But can we actually apply this computational amenability more directly to biological knowledge? In this example, which is work by Katy Wolstencroft, we have codified community knowledge about protein domains in phosphatases in OWL. We then take unknown protein sequences, pass them through InterPro and put them into the instance store, which is basically a database and a reasoner tied together. Qualified cardinality!
  • #34: Slide Title: Literature Lots of books in a library