The Semantics of Genomic Analysis
Robert Stevens 
BioHealth Informatics Group 
School of Computer Science 
University of Manchester 
robert.stevens@manchester.ac.uk
The Collaboration 
“Developing a GRID-based system for 
integrating and exploring data from 
comparative genomics, to discover 
biological knowledge that can not be 
discovered from any one source” 
Collaborative BBSRC project 
– 5 sites across the UK 
http://www.comparagrid.org
Introduction 
• The general problem 
• An architecture for getting answers 
• Forming the questions and making 
distinctions 
• Some observations
Sleeping Cows: Unknown Genes 
??? 
Cow Infected with African 
Trypanosomiasis 
Cow chromosome and 
known Genes
Sleeping Mouse Model with QTL 
Mouse Infected with African 
Trypanosomiasis 
Mouse QTL region on 
chromosome, encompassing 
multiple genes
We can Find Out About Mouse Resistance 
Mouse Infected with African Trypanosomiasis 
Mouse QTL region on chromosome, 
encompassing multiple genes 
Genes A, B, C, D in QTL are 
involved in Trypanosomiasis 
resistance
Use Mouse to Tell us about Cow 
Infer presence and 
order of Cow 
genes from 
presence and order 
of Mouse genes
Requirements 
• Ask for maps that contain a marker from
species x in species y, z, etc.
• What chromosome or part of chromosome 
does a map model? 
• What is the gene order in species x and 
species y? 
• Also notions of similarity, homology, and, 
of course, synteny 
• Orthology and paralogy
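
The gene-order requirement above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Marker` type, the marker names and the positions are hypothetical, and the comparison is a plain ordering check rather than the project's actual method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Marker:
    name: str        # hypothetical marker label, e.g. "A"
    position: float  # position along one map, in arbitrary units


def gene_order(markers):
    """Return marker names sorted by their position on one map."""
    return [m.name for m in sorted(markers, key=lambda m: m.position)]


def shared_order(order_x, order_y):
    """Keep only names found in both species, preserving each species' order."""
    common = set(order_x) & set(order_y)
    return ([n for n in order_x if n in common],
            [n for n in order_y if n in common])


def is_collinear(order_x, order_y):
    """True if the shared markers appear in the same (or fully reversed) order."""
    shared_x, shared_y = shared_order(order_x, order_y)
    return shared_x == shared_y or shared_x == shared_y[::-1]


# Toy data: invented markers on one mouse map and one cow map.
mouse = [Marker("A", 1.0), Marker("B", 2.5), Marker("C", 4.0), Marker("D", 6.0)]
cow = [Marker("A", 10.0), Marker("X", 12.0), Marker("C", 14.0), Marker("D", 20.0)]

mouse_order = gene_order(mouse)
cow_order = gene_order(cow)
```

Comparing only the shared markers is a deliberate simplification: it says nothing about homology, orthology or paralogy, which the requirements list separately.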
The Fluxion Stack 
[Diagram. Layers: Syntax, Semantics, Aggregation. Components: raw data (two sources), Pub svc, Trans svc, integrator. Flows: query, data.]
Query Semantics 
• Query OWL class interpreted as 
– K query – (poor man’s) epistemic closure of query 
– Against knowledge-base exposed by that data-source, not 
The World 
• Result is a knowledge-base 
– All entailed by queried KB (it’s a subset) 
– Can be statements in the original KB, or any statements that 
always follow 
• Contains at least the statements needed to 
– Allow a reasoner to classify all the individuals who match the 
query correctly 
– Preferably using properties, not asserted types (a-box
preferred over t-box; don’t over-commit)
– Speed for accuracy 
– Implementation complexity for data-volume 
– Return all instances of known classes e.g. db table with 
minimal filtering – if in doubt, return it
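
A minimal sketch of the local-closure idea, in Python. It assumes a knowledge base held as a set of (subject, property, object) triples. The triple names are hypothetical, and this is not Fluxion's implementation. Under the open world, a missing statement proves nothing. Under local closure, the knowledge base exposed by the data source is treated as complete for the purposes of the query.

```python
# Hypothetical knowledge base exposed by one data source.
KB = {
    ("marker_1", "located_on", "chr_a"),
    ("marker_2", "located_on", "chr_a"),
    ("marker_1", "has_homolog", "cow_marker_7"),
}


def known_values(kb, subject, prop):
    """All objects the KB states for this subject and property."""
    return {o for s, p, o in kb if s == subject and p == prop}


def lacks_value(kb, subject, prop, locally_closed):
    """False if a value is known; otherwise True (closed) or None (open: unknown)."""
    if known_values(kb, subject, prop):
        return False
    return True if locally_closed else None


def query_lacking(kb, prop, locally_closed):
    """Return matching subjects plus the sub-KB of statements about them."""
    subjects = {s for s, _, _ in kb}
    matches = sorted(
        s for s in subjects if lacks_value(kb, s, prop, locally_closed) is True
    )
    # The result is itself a knowledge base: a subset of the queried one.
    result_kb = {t for t in kb if t[0] in matches}
    return matches, result_kb


open_matches, open_kb = query_lacking(KB, "has_homolog", locally_closed=False)
closed_matches, closed_kb = query_lacking(KB, "has_homolog", locally_closed=True)
```

With the open-world setting nothing matches "has no homolog". With local closure, `marker_2` matches and the statements about it come back as the result.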
Role of Ontology in Fluxion 
• Fluxion 
– Uses semantics of OWL 
– Not any domain-specific information 
– Any domain 
• A domain ontology defines what Fluxion integrates 
• Developing a ‘good’ domain ontology is 
– Hard work 
– Poorly scoped 
– No widely-validated methodology 
– Biologist ≠ Modeller, so language gap
ComparaGRID Ontology Scope 
• Genetics 
– Markers and Maps 
• Genomics 
– Genomes and Sequences 
• Comparative Aspects 
– Evolutionary relationships, 
Similarities 
• Physical entities 
– Chromosome, Organism
Models and Representations 
• We ‘know’ things about physical biological things 
– We know other things about the way we render information 
about these things 
– What reconciles a map and a sequence of a thing is the thing 
itself 
– A “knowledge anchor” 
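[Diagram: Maps, Physical Biological Entities, Sequences. A physical biological entity "has model" a map (the map "is model of" the entity) and "has representation" a sequence (the sequence "is representation of" the entity).]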
Modelling Maps 
• A map is an abstract model of a physical thing 
– Good for ordering limited knowledge 
– Minimal explicit biology in a map, just implicit biology in the 
labelling of things 
– Maps can be modelled as a combination of lines and blobs: 
[Diagram: a line starting at 0 with blobs A, B, C, D, E along it]
1) Line: model-part
2) Blobs: model-parts
3) Map: model of ordering of blobs within line
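
The lines-and-blobs view can be written down as a small data structure. This is a Python sketch under stated assumptions: the class names `Line`, `Blob` and `Map` follow the slide's vocabulary, while the fields (`label`, `offset`, `origin`) are hypothetical and not taken from the ComparaGRID ontology.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Blob:
    label: str     # implicit biology lives only in the label
    offset: float  # distance from the line's origin, arbitrary units


@dataclass(frozen=True)
class Line:
    label: str
    origin: float = 0.0


@dataclass
class Map:
    """A map is a model of the ordering of blobs within a line."""
    line: Line
    blobs: list = field(default_factory=list)

    def ordering(self):
        """Blob labels in the order they occur along the line."""
        return [b.label for b in sorted(self.blobs, key=lambda b: b.offset)]


example_map = Map(Line("line-1"), [Blob("C", 3.0), Blob("A", 1.0), Blob("B", 2.0)])
```

Note that nothing in this structure says what a blob is biologically. That matches the point that a map carries minimal explicit biology.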
Modelling ‘Markers’ 
• ‘Marker’ is an overloaded term
• Meaning here is captured by linking lines and blobs 
up to physical things 
[Diagram: a line starting at 0 with blobs A, B, C, D, E, annotated as follows]
– Map: is model of (ordering of things on) chromosome or region
– ‘Line’: is model of chromosome or region
– ‘Blob’: is model of (detectable) region
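
Linking models to physical things makes the question "what chromosome does a map model?" answerable, and makes it possible to find every map of the same thing. A minimal Python sketch follows. The map and chromosome identifiers are hypothetical, and the flat list of pairs stands in for "is model of" statements that would really live in an OWL knowledge base.

```python
from collections import defaultdict

# Hypothetical (model, physical thing) pairs, each read as "model is model of thing".
IS_MODEL_OF = [
    ("genetic_map_1", "mouse_chr_a"),
    ("rh_map_2", "mouse_chr_a"),
    ("genetic_map_3", "cow_chr_b"),
]


def modelled_entity(links, model):
    """What physical thing does this map model? None if nothing is stated."""
    for candidate, entity in links:
        if candidate == model:
            return entity
    return None


def models_by_entity(links):
    """Group models under the physical thing that anchors them."""
    index = defaultdict(list)
    for model, entity in links:
        index[entity].append(model)
    return dict(index)


anchored = models_by_entity(IS_MODEL_OF)
```

Two maps that share an anchor, here `genetic_map_1` and `rh_map_2`, can be reconciled through the physical thing rather than through each other.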
Strategy: Domain and Application Ontologies 
[Diagram. Boxes: Upper classes; Domain classes; Classes used by data model(s); Datatypes. Arrows labelled: Derives, Informs.]
TAMBIS All Over Again
• …, but now we have different 
technologies; bio-ontologies galore; 
different resources. 
• Mainly SQL queries over RDBMS 
• Can actually answer the questions we ask 
• Rather than mapping down to the 
resources; we map down from the global 
model and up from the resources into an 
OWL world 
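
Mapping "up from the resources" can be sketched as lifting relational rows into statements. The Python example below uses the standard-library `sqlite3` module with an in-memory table. The table layout, the class name `Marker` and the property names `isPartOf` and `hasPosition` are all hypothetical, and the triples are plain tuples rather than real OWL axioms.

```python
import sqlite3

# In-memory stand-in for one relational resource (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE marker (name TEXT, map_name TEXT, position REAL)")
conn.executemany(
    "INSERT INTO marker VALUES (?, ?, ?)",
    [("A", "map1", 1.0), ("B", "map1", 2.5)],
)


def lift_rows(connection):
    """Turn every row of the marker table into (subject, property, object) triples."""
    triples = []
    rows = connection.execute(
        "SELECT name, map_name, position FROM marker ORDER BY name"
    )
    for name, map_name, position in rows:
        triples.append((name, "rdf:type", "Marker"))      # asserted type
        triples.append((name, "isPartOf", map_name))      # property-based statement
        triples.append((name, "hasPosition", position))   # datatype value
    return triples


lifted = lift_rows(conn)
```

The query does almost no filtering. It returns the whole table and leaves classification to a reasoner working over the lifted statements.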
Main Points 
• It’s hard: Sophisticated ontologies don’t
appear by magic
• Open world good, open world bad 
• Need to do local closure: A poor man’s 
epistemic operator… 
• It isn’t completely correct 
• A sophisticated ontology in a SW application 
• Highly axiomatised ontologies for the 
Semantic Web?

Editor's Notes

  • #5: Slide 1. Sick cow on left, arrow in middle, chromosome on right, question marks above arrow. Text says: (left) Cow Infected with African Trypanosomiasis; (right) Cow chromosome and known Genes.
  • #6: Slide 2. Sleeping mouse on left, arrow in middle, chromosome on right with QTL and multiple genes highlighted. Text says: (left) Mouse Infected with African Trypanosomiasis; (right) Mouse QTL region on chromosome, encompassing multiple genes.
  • #7: Slide 3. Sleeping mouse on left, arrow in middle, chromosome on right with QTL and multiple genes highlighted. Text says: (left, small text) Mouse Infected with African Trypanosomiasis; (right, small text) Mouse QTL region on chromosome, encompassing multiple genes; (centre, large text) Genes A, B, C, D in QTL are involved in Trypanosomiasis resistance.
  • #8: Slide 4. Sleeping mouse on left below mouse chromosome, arrow in middle, sick cow on right below cow chromosome. Text says: (centre) Infer presence and order of Cow genes from presence and order of Mouse genes.