Using the Semantic Web to Support Ecoinformatics. Andriy Parafiynyk, University of Maryland, Baltimore County. http://ebiquity.umbc.edu/paper/html/id/319/Using-the-Semantic-Web-to-Support-Ecoinformatics Joint work with Tim Finin, Joel Sachs, Cynthia Sims Parr, Rong Pan, Lushan Han, Li Ding (UMBC), Allan Hollander (UCD), and David Wang (UMCP). This research was supported by NSF ITR 0326460 and matching funds received from the USGS National Biological Information Infrastructure.
Invasive Species: Invasive species cost the U.S. economy over $138 billion per year [1]. By various estimates, these species contribute to the decline of 35 to 46 percent of U.S. endangered and threatened species. The invasive species problem is growing as the number of pathways of invasion increases. [1] Pimentel et al. 2000. Environmental and economic costs associated with non-indigenous species in the United States. BioScience 50:53-65. [2] Charles Groat, Director, U.S. Geological Survey, http://www.usgs.gov/invasive_species/plw/usgsdirector01.html
The most common ways biologists currently deal with data: journal articles; Excel spreadsheets; local databases; some information online in HTML/XML.
The Semantic Web can offer: ontologies to arrive at a common vocabulary and define exactly what is what across disciplines (multiple ontologies with mappings are possible); constant online data availability with convenient ways of acquiring and processing data; data discovery (Swoogle); data integration from different sources, with queries over data from multiple sources; expansion of the knowledge base by inferencing; data that can be easily updated or added, with users notified.
Old vs. new workflow (figure). Color legend: green = data gathering; pink = data integration and manipulation; white = data analysis; blue = results dissemination. Old workflow steps: collect data OR find data tables in literature or a data registry OR email the author of the data; massage data manually; write up metadata record; register dataset with data registry; start over for next project; run analyses; publish paper; post supplemental data file on web; create local spreadsheet. New workflow steps: build automatically updating dynamic dataset; develop intelligent query for semantic web data; download to local spreadsheet; run analyses; publish paper; reanalyze using latest dataset (query and data already publicly available).
An NSF ITR collaborative project with: University of Maryland, Baltimore County; University of Maryland, College Park; University of California, Davis; Rocky Mountain Biological Laboratory.
Food Webs: A food web models the trophic (feeding) relationships between organisms in an ecosystem. Food web simulators are used to explore the consequences of changes in the ecosystem, such as the introduction or removal of a species. A location's food web is usually constructed from studies of the frequencies of the species found there and the known trophic relations among them. Goal: automatically construct a food web for a new location using existing data and knowledge. ELVIS: Ecosystem Location Visualization and Information System.
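As a rough illustration of what a food web simulator explores, the sketch below stores a food web as (consumer, resource) links and asks which species are lost when one is removed. The species, the links, and the rule that a consumer is lost only when all of its resources are gone are illustrative assumptions, not data or methods from the project.

```python
from collections import defaultdict

# Hypothetical toy food web as (consumer, resource) pairs.
LINKS = [
    ("fox", "rabbit"),
    ("rabbit", "grass"),
    ("hawk", "rabbit"),
    ("hawk", "mouse"),
    ("mouse", "grass"),
]


def build_web(links):
    """Map each consumer to the set of resources it eats."""
    web = defaultdict(set)
    for consumer, resource in links:
        web[consumer].add(resource)
    return web


def remove_species(links, removed):
    """Return every species lost when `removed` disappears.

    Assumed rule: a consumer is lost once all of its resources are lost.
    """
    diet = build_web(links)
    lost = {removed}
    changed = True
    while changed:
        changed = False
        for consumer, resources in diet.items():
            if consumer not in lost and resources <= lost:
                lost.add(consumer)
                changed = True
    return lost


if __name__ == "__main__":
    print(sorted(remove_species(LINKS, "rabbit")))
    print(sorted(remove_species(LINKS, "grass")))
```

Real simulators model population dynamics rather than this all-or-nothing rule; the sketch only shows why the link structure matters.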
East River Valley Trophic Web   http://www.foodwebs.org/
Species List Constructor: Click a county, get a species list.
The problem: We know which species exist in the location and can further restrict and fill in with other ecological models. But we don’t know which of them might be eaten by a potential invasive, or which might eat the invasive. We can reason from taxonomic data (similar species) and known natural history data (size, mass, habitat, etc.) to fill in the gaps.
Food Web Constructor: Predict food web links using database and taxonomic reasoning. In a new estuary, Nile Tilapia could compete with ostracods (green) to eat algae. Predators (red) and prey (blue) of ostracods may be affected.
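The Nile Tilapia example can be sketched as a small query over food web links: find species that share a resource with the invader, then list their predators and prey. The link set below is hypothetical and assumes the invader's own feeding links have already been predicted; the function names are made up for this sketch.

```python
# Hypothetical (consumer, resource) links, loosely following the slide's example.
LINKS = {
    ("Nile tilapia", "algae"),
    ("ostracod", "algae"),
    ("ostracod", "detritus"),
    ("small fish", "ostracod"),
}


def prey_of(links, species):
    """Resources eaten by `species`."""
    return {resource for consumer, resource in links if consumer == species}


def predators_of(links, species):
    """Consumers that eat `species`."""
    return {consumer for consumer, resource in links if resource == species}


def competitors_of(links, species):
    """Other species that share at least one resource with `species`."""
    resources = prey_of(links, species)
    return {c for c, r in links if r in resources and c != species}


def affected_by_invader(links, invader):
    """For each competitor, report the shared resources, its predators and its prey."""
    report = {}
    for rival in sorted(competitors_of(links, invader)):
        report[rival] = {
            "shared": sorted(prey_of(links, rival) & prey_of(links, invader)),
            "predators": sorted(predators_of(links, rival)),
            "prey": sorted(prey_of(links, rival)),
        }
    return report


if __name__ == "__main__":
    print(affected_by_invader(LINKS, "Nile tilapia"))
```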
Evidence Provider: Examine evidence for predicted links.
ELVIS: The final goal is ELVIS (Ecosystem Location Visualization and Information System), an integrated set of web services for constructing food webs for a given location.
Background Ontologies: SpireEcoConcepts (confirmed and potential food web links; bibliographic information of food web studies; ecosystem terms; taxonomic ranks). California Wildlife Habitat Relationships Ontology (life history; geographic range; management information). ETHAN (Evolutionary Trees and Natural History): concepts and properties for ‘natural history’ information on species, derived from data in the Animal Diversity Web and other taxonomic sources.
Data representation: ETHAN Ontology. ethan_animals.owl: phylogenetic information about organisms. ethan_keywords.owl: geographic range, habitats, physical description, trophic information, reproduction, lifespan, behavioral information, conservation status. Information in triples: “Esox lucius” is a subclass of “Esox”; “Esox lucius” has max mass “1.4 kg”; “Esox” eats “Actinopterygii”.
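To make the triple representation concrete, here is a minimal sketch that stores the three example statements as (subject, predicate, object) tuples and follows the subclass relation to collect a diet stated at a higher taxonomic level. The predicate names (`subClassOf`, `maxMass`, `eats`) are placeholders rather than ETHAN's actual property URIs, and treating `eats` as inherited by subclasses is an assumption about how the ontology models it.

```python
# Triples from the slide as (subject, predicate, object) tuples.
# Predicate names are hypothetical stand-ins for the ontology's properties.
TRIPLES = {
    ("Esox lucius", "subClassOf", "Esox"),
    ("Esox lucius", "maxMass", "1.4 kg"),
    ("Esox", "eats", "Actinopterygii"),
}


def match(triples, s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def ancestors(triples, cls):
    """All superclasses reachable through subClassOf (transitive closure)."""
    found, frontier = set(), [cls]
    while frontier:
        current = frontier.pop()
        for _, _, parent in match(triples, s=current, p="subClassOf"):
            if parent not in found:
                found.add(parent)
                frontier.append(parent)
    return found


def inherited_diet(triples, cls):
    """Diet stated for the class itself or for any of its superclasses."""
    diet = set()
    for c in {cls} | ancestors(triples, cls):
        diet |= {o for _, _, o in match(triples, s=c, p="eats")}
    return diet


if __name__ == "__main__":
    print(ancestors(TRIPLES, "Esox lucius"))
    print(inherited_diet(TRIPLES, "Esox lucius"))
```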
Using ETHAN and OWL inferencing to predict success of invasive species. Known food web link: rabbit eats carrot (yummy!!!). What about hare (yummy???)? Yes, with high probability, since both are subclasses of the same class in the taxonomic hierarchy, have the same habitat, etc.
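The rabbit and hare reasoning can be sketched as a simple heuristic: propose a link for a species when a taxonomic sibling has that link, and raise the score when the habitats match. The taxonomy fragment, the habitat values, and the scores 0.5 and 0.25 are invented for illustration; they are not probabilities or data from the project.

```python
# Hypothetical taxonomy fragment (child -> parent) and natural history data.
PARENT = {
    "rabbit": "Leporidae",
    "hare": "Leporidae",
    "pike": "Esocidae",
}
HABITAT = {"rabbit": "grassland", "hare": "grassland", "pike": "freshwater"}

# Known food web links as (consumer, resource).
KNOWN_LINKS = {("rabbit", "carrot")}


def predict_links(species):
    """Propose {resource: score} for `species` from its siblings' known links."""
    predictions = {}
    for relative, resource in KNOWN_LINKS:
        if relative == species:
            continue  # already known, nothing to predict
        if PARENT.get(relative) != PARENT.get(species):
            continue  # not siblings in the taxonomy
        score = 0.5  # made-up base score for sharing a parent class
        if HABITAT.get(relative) == HABITAT.get(species):
            score += 0.25  # made-up bonus for sharing a habitat
        predictions[resource] = max(score, predictions.get(resource, 0.0))
    return predictions


if __name__ == "__main__":
    print(predict_links("hare"))
    print(predict_links("pike"))
```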
http://swoogle.umbc.edu/ Running since summer 2004. 1.8M RDF docs, 320M triples, 10K ontologies, 15K namespaces, 1.3M classes, 175K properties, 43M instances, 600 registered users.
Applications and use cases: (1) Supporting Semantic Web developers: ontology designers, vocabulary discovery, who’s using my ontologies or data?, use analysis, errors, statistics, etc. (2) Searching specialized collections: Spire (aggregating observations and data from biologists); InferenceWeb (searching over and enhancing proofs); SemNews (text meaning of news stories). (3) Supporting Semantic Web tools: Triple Shop (finding data for SPARQL queries).
Search for ontologies that contain these terms.
746 ontologies were found that had these two terms. By default, ontologies are ordered by their ‘popularity’, but they can also be ordered by date or size.
We can also search for any RDF documents containing these terms.
5,378 documents were found that had these two terms.
UMBC Triple Shop: http://sparql.cs.umbc.edu/tripleshop2/ Finding datasets in the absence of the FROM clause. Constraints by URI domain or namespace (more coming). Reasoning (none/rdfs/owl). Dataset persistence: queries and results can be saved, tagged, annotated, shared, searched for, etc.
What are the body masses of fishes that eat fishes? ... leaving out the FROM clause. (Diagram labels: Swoogle, Triple Shop.)
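One way to picture dataset construction without a FROM clause is a lookup from the terms a query mentions to the documents that use them, optionally restricted by URI domain. The sketch below assumes a hypothetical in-memory index and made-up example.org URLs; it does not reproduce how Swoogle or Triple Shop are implemented.

```python
from urllib.parse import urlparse

# Hypothetical index: document URL -> terms (classes/properties) it uses.
INDEX = {
    "http://example.org/ethan/fish.owl": {"eats", "maxMass", "Actinopterygii"},
    "http://example.org/spire/links.rdf": {"eats"},
    "http://example.com/other/cars.rdf": {"maxMass", "wheelCount"},
}


def find_documents(query_terms, index, domain=None):
    """Return URLs of documents mentioning any query term, best match first."""
    hits = []
    for url, terms in index.items():
        if domain and urlparse(url).hostname != domain:
            continue  # constraint by URI domain
        overlap = len(set(query_terms) & terms)
        if overlap:
            hits.append((overlap, url))
    # Highest overlap first; ties are broken by URL so the order is deterministic.
    return [url for _, url in sorted(hits, key=lambda h: (-h[0], h[1]))]


if __name__ == "__main__":
    # Terms a query about "body masses of fishes that eat fishes" might use.
    terms = {"eats", "maxMass"}
    print(find_documents(terms, INDEX))
    print(find_documents(terms, INDEX, domain="example.org"))
```

The returned documents would then be reviewed and added to the dataset the query runs against, as the following slides walk through.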
Specify the dataset.
RDF documents were found that might have useful data.
We’ll select them all and add them to the current dataset.
We’ll run the query against this dataset to see if the results are as expected.
The results can be produced in any of several formats.
Results: http://sparql.cs.umbc.edu/tripleshop2/
Looks like a useful dataset. Let’s save it and also materialize it in the TS triple store.
Contributions: OWL ontologies for the ecoinformatics domain (data representation, data sharing, inferencing). OWL data discovery. Ability to automatically construct datasets relevant to the query. Dataset storage/sharing.

