Using the Biological Collections 
Ontology to Advance Biodiversity 
Science 
TDWG 2014, Jönköping, Sweden 
Ramona Walls 
John Wieczorek 
Robert Guralnick 
John Deck
Overview 
1. How we model biodiversity information in 
the Biological Collections Ontology 
2. Integrating ontologies into biodiversity 
information workflows
Properties in an example Darwin 
Core record 
• occurrenceID 
• modified 
• rights 
• institutionCode 
• collectionCode 
• datasetName 
• basisOfRecord 
• dynamicProperties 
• catalogNumber 
• recordedBy 
• sex 
• preparations 
• otherCatalogNumbers 
• associatedMedia 
• associatedReferences 
• associatedSequences 
• eventDate 
• year 
• month 
• day 
• fieldNumber 
• eventRemarks 
• higherGeography 
• continent 
• waterBody 
• islandGroup 
• island 
• country 
• stateProvince 
• county 
• locality 
• minimumDepthInMeters 
• maximumDepthInMeters 
• locationRemarks 
• decimalLatitude 
• decimalLongitude 
• geodeticDatum 
• coordinateUncertaintyInMeters 
• georeferencedBy 
• georeferencedDate 
• georeferenceSources 
• georeferenceRemarks 
• identifiedBy 
• dateIdentified 
• typeStatus 
• scientificName 
• kingdom 
• phylum 
• class 
• order 
• family 
• genus 
• specificEpithet 
• infraspecificEpithet 
• scientificNameAuthorship
Properties in an example Darwin 
Core record, by class 
The next five slides repeat the property list above, each labeled with one class: 
• RECORD 
• MATERIAL SAMPLE & ORGANISM 
• EVENT & OCCURRENCE 
• LOCATION 
• IDENTIFICATION/TAXON
Using DwC properties in BCO: 
Event as an example
Material entities, information entities, and 
processes in the Basic Formal Ontology
Mapping DwC classes to BCO: 
basisOfRecord terms as an example
How to create RDF triples (using ontology terms) for 
biodiversity data 
Check for an easy way first! 
See if you can use the BiSciCol triplifier (http://biscicol.org/triplifier/) or a similar tool that 
automates file conversion for specific formats. If not, proceed. 
Create Mapping File 
• Group the columns and assign each group to the relevant class. 
• Define the columns that contain a URI identifier for each class within each distinct record. 
• If you’re not importing an existing ontology, create relationships between classes. 
Assemble these into a mapping file; its format depends on the tool used in the next step. 
Use Conversion Tool 
Check out WebKarma (http://www.isi.edu/integration/karma/) or D2RQ (http://d2rq.org/). 
Send to Triple-Store 
Upload the data to a triple store or SPARQL endpoint (e.g., Virtuoso, http://www.openlinksw.com/). 
Source: http://www.wikihow.com/Create-RDF-Triples-%28Using-Ontology-Terms%29-for-Biodiversity-Data
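The mapping steps above can be sketched in a few lines of Python. This is a minimal illustration, not the method used by the triplifier, WebKarma, or D2RQ. It assumes each record already carries URI identifiers in occurrenceID and in an eventID column (eventID is a Darwin Core term but is not in the example record shown earlier), and it uses a made-up predicate, http://example.org/terms/relatedTo, where a real mapping would take the relationship from the ontology. The names MAPPING, RELATIONS, and triplify are hypothetical.

```python
from typing import Dict, List

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DWC = "http://rs.tdwg.org/dwc/terms/"  # Darwin Core terms namespace
# Hypothetical predicate linking classes; a real mapping would take
# this relationship from the ontology being used.
RELATED_TO = "http://example.org/terms/relatedTo"

# Stand-in for a mapping file: each class gets an identifier column
# and the group of property columns assigned to it.
MAPPING = {
    "Occurrence": {"id_column": "occurrenceID",
                   "columns": ["catalogNumber", "recordedBy", "sex"]},
    "Event": {"id_column": "eventID",
              "columns": ["eventDate", "fieldNumber"]},
}
# Relationships between classes: (subject class, predicate, object class).
RELATIONS = [("Occurrence", RELATED_TO, "Event")]


def literal(value: str) -> str:
    """Escape a value as an N-Triples plain literal."""
    escaped = (value.replace("\\", "\\\\").replace('"', '\\"')
                    .replace("\n", "\\n").replace("\r", "\\r"))
    return f'"{escaped}"'


def triplify(row: Dict[str, str]) -> List[str]:
    """Turn one flat record into N-Triples lines using MAPPING."""
    lines = []
    for cls, spec in MAPPING.items():
        subject = row[spec["id_column"]]
        # Declare the class of the thing this identifier refers to.
        lines.append(f"<{subject}> <{RDF_TYPE}> <{DWC}{cls}> .")
        for col in spec["columns"]:
            if row.get(col):  # skip empty or missing cells
                lines.append(f"<{subject}> <{DWC}{col}> {literal(row[col])} .")
    for subj_cls, pred, obj_cls in RELATIONS:
        s = row[MAPPING[subj_cls]["id_column"]]
        o = row[MAPPING[obj_cls]["id_column"]]
        lines.append(f"<{s}> <{pred}> <{o}> .")
    return lines


if __name__ == "__main__":
    # Illustrative record; all values are invented.
    example = {
        "occurrenceID": "http://example.org/occ/1",
        "eventID": "http://example.org/event/1",
        "catalogNumber": "12345",
        "sex": "female",
        "eventDate": "2014-10-27",
    }
    print("\n".join(triplify(example)))
```

The resulting lines are N-Triples, a format that triple stores can generally load directly. Keeping the mapping as data rather than code mirrors the idea of a separate mapping file.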
Specimen data from a Darwin Core 
Archive: VertNet
iMicrobe data links specimens to metagenomic 
sequences and environmental parameters 
Collecting event: 
location 
depth 
weather 
cruise 
biome 
site description 
temperature 
… 
Metagenomic sequence: 
library accession # 
sequencing method 
molecule type 
number of reads 
… 
Parameters: 
salinity 
pH 
fluorescence 
turbidity 
sample volume 
silicate 
oxygen 
dissolved organic carbon 
…
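The links described on this slide can be pictured as a small data structure. The sketch below is only an illustration of the linkage (specimen to collecting event, sequences, and parameters). It is not the iMicrobe schema, and every class name, field name, and value in it is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class CollectingEvent:
    location: str
    depth_m: Optional[float] = None
    biome: Optional[str] = None


@dataclass
class MetagenomicSequence:
    library_accession: str
    sequencing_method: str
    number_of_reads: int


@dataclass
class Specimen:
    specimen_id: str
    event: CollectingEvent
    sequences: List[MetagenomicSequence] = field(default_factory=list)
    # Environmental parameters measured for this specimen, keyed by name.
    parameters: Dict[str, float] = field(default_factory=dict)


def sequences_where(specimens: List[Specimen], parameter: str,
                    minimum: float) -> List[str]:
    """Accessions of sequences from specimens whose parameter is >= minimum.

    Specimens that lack the parameter are skipped.
    """
    return [seq.library_accession
            for sp in specimens
            if parameter in sp.parameters and sp.parameters[parameter] >= minimum
            for seq in sp.sequences]


if __name__ == "__main__":
    # Invented values, for illustration only.
    event = CollectingEvent(location="station A", depth_m=10.0, biome="marine")
    specimens = [
        Specimen("sp1", event,
                 [MetagenomicSequence("LIB001", "shotgun", 1000)],
                 {"salinity": 35.1, "pH": 8.1}),
        Specimen("sp2", event,
                 [MetagenomicSequence("LIB002", "shotgun", 2000)],
                 {"salinity": 30.2}),
    ]
    print(sequences_where(specimens, "salinity", 33.0))
```

Because each specimen points to both its sequences and its measured parameters, a question such as "which libraries came from high-salinity samples" becomes a simple traversal.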
iMicrobe data mapped to BCO
Linking prospective data to ontologies 
is much easier! 
query
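To give a rough sense of what querying linked data involves, here is a minimal in-memory sketch of triple pattern matching. It stands in for a real triple store and SPARQL endpoint, which the slides recommend but which are not shown here. The prefixed names (ex:Sample, ex:pH, and so on) and the functions match and values_for_type are hypothetical.

```python
from typing import Dict, Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]


def match(triples: List[Triple], s: Optional[str] = None,
          p: Optional[str] = None, o: Optional[str] = None) -> Iterator[Triple]:
    """Yield triples matching the pattern; None acts as a wildcard."""
    for t in triples:
        if ((s is None or t[0] == s) and
                (p is None or t[1] == p) and
                (o is None or t[2] == o)):
            yield t


def values_for_type(triples: List[Triple], type_: str,
                    predicate: str) -> Dict[str, str]:
    """Join two patterns: subjects of a given type, and one value for each."""
    result = {}
    for subject, _, _ in match(triples, p="rdf:type", o=type_):
        for _, _, value in match(triples, s=subject, p=predicate):
            result[subject] = value  # keeps the last value if there are several
    return result


if __name__ == "__main__":
    # Invented triples, for illustration only.
    data = [
        ("ex:s1", "rdf:type", "ex:Sample"),
        ("ex:s1", "ex:pH", "8.1"),
        ("ex:s2", "rdf:type", "ex:Sample"),
        ("ex:s2", "ex:pH", "7.6"),
        ("ex:e1", "rdf:type", "ex:Event"),
    ]
    print(values_for_type(data, "ex:Sample", "ex:pH"))
```

Typing every subject with an ontology class is what makes the first pattern possible: the query asks for "all samples" without knowing which spreadsheet or column they came from.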
Conclusions 
• BCO can work across different data types, not just 
for DwC. 
• The work of producing BCO has forced us to look 
at DwC definitions more rigorously. 
• BCO provides an opportunity to manage parts of 
the DwC vocabulary as controlled vocabularies 
that are rigorously, logically defined. 
– example: basisOfRecord 
• Road map for this work includes the intention to 
propose BCO as a TDWG standard.
Acknowledgments 
• Dozens of participants at BCO workshops and 
hackathons over the past two years 
• NSF-EAGER: An Interoperable Information 
Infrastructure for Biodiversity Research (I3BR) 
• NSF: Research Coordination Network for GSC 
(RCN4GSC) 
• Gordon and Betty Moore Foundation (iMicrobe) 
• VertNet 
• University of Kansas Biodiversity Institute

Editor's Notes

  • #2: Ramona and introductions
  • #3: Ramona
  • #4: John: Show typical metadata and how it confounds material entities and processes, and why this is a problem.
  • #5: John: Show typical metadata and how it confounds material entities and processes, and why this is a problem.
  • #6: John: Show typical metadata and how it confounds material entities and processes, and why this is a problem.
  • #7: John
  • #8: John
  • #9: John
  • #10: Ramona
  • #11: Ramona: separation of processes, material entities, and information content entities, and how they link to one another
  • #12: John: Show structure of specimens and observations
  • #13: Ramona
  • #14: John
  • #15: Ramona
  • #17: Ramona: from a data collection spreadsheet with an ontology behind it, to a triple store you can query, to new discoveries!
  • #18: John
  • #21: Keep this at the end as an example