A Schema for Description and Exchange of Taxonomic Publication's Content
Donat Agosti, Terry Catapano, Lyubomir Penev & Guido Sautter
Plazi, Bern, Switzerland
25 July 2011, IBC, Melbourne
WHY?
disseminate
access
knowledge
New York Times, July 19, 2011
“JSTOR's the one that should be in prison, man, for locking up knowledge.”
HuffPost Politics, July 19, 2011
http://www.huffingtonpost.com/2011/07/19/huffpost-hill----gang-vio_n_904027.html
Open Access
An example from the Neurocommons text mining pilot:
 PubMed abstracts: > 16,000,000
 CNS classified abstracts: 874,727
 text mining recognized: 368,688
 text mining processed: 94,381
 extracted graph of 30,000+ relationships and 5,500 genes and proteins
“protein-protein interaction networks”
John Wilbanks, Neurocommons
[Figure labels: 27,266 papers, 128,437 papers, 41,985 papers, 4,563 papers, 10,365 papers]
In a semantic Web environment (where machines talk to each other and do most of our work), data need to be able to talk to each other.
It will open up scientific literature for data mining.
HOW?
access for human AND machine
It is about digesting millions of pages:
 >>100 M pages of taxonomic literature
 25M scientific publications / year
 25K journals
 >2K with zoological taxonomic descriptions
 18K descriptions of new species / year
PDF is not enough
data and information in context
semantic markup
context of content
XML
eXtensible Markup Language
<tax:treatment>
  <tax:nomenclature>
    <tax:name>
      <tax:xid source="HNS" identifier="193329"/>
      <tax:xmldata>
        <dc:Genus>Mystrium</dc:Genus>
        <dc:Species>leonie</dc:Species>
      </tax:xmldata>
      Mystrium leonie Bihn &amp; Verhaagh, new species
    </tax:name>
    <tax:status>n. sp.</tax:status>
    Fig 1 D - F
  </tax:nomenclature>
  <tax:div type="description">
    <tax:p>HOLOTYPE WORKER: TL 3.95, HL 1.02, HW 0.95, CI 93, SL
      1.30, SI 137, PW 0.73, ML 0.38. Mandible outer margin strongly curving
      to a sharp apical tooth, the apex parallel to the anterior clypeal margin.
      (Holotype with material in mandibles, so mandibles and anterior clypeus
      described below from paratypes.) Median clypeus....</tax:p>
  </tax:div>
</tax:treatment>
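Once a treatment is marked up like this, a machine can pull the name parts and the external identifier out of it without guessing at the prose. The following is a minimal sketch in Python, assuming a trimmed copy of the treatment above; the namespace URIs are hypothetical placeholders, since the slide does not show the namespace declarations, and `extract_names` is an illustrative helper, not part of any Plazi tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs: the slide omits the real declarations.
NS = {
    "tax": "http://example.org/taxonx",
    "dc": "http://example.org/darwin-core",
}

# Trimmed copy of the treatment from the slide, with namespaces declared
# and the ampersand escaped so that the document is well-formed.
SAMPLE = """\
<tax:treatment xmlns:tax="http://example.org/taxonx"
               xmlns:dc="http://example.org/darwin-core">
  <tax:nomenclature>
    <tax:name>
      <tax:xid source="HNS" identifier="193329"/>
      <tax:xmldata>
        <dc:Genus>Mystrium</dc:Genus>
        <dc:Species>leonie</dc:Species>
      </tax:xmldata>
      Mystrium leonie Bihn &amp; Verhaagh, new species
    </tax:name>
    <tax:status>n. sp.</tax:status>
  </tax:nomenclature>
</tax:treatment>
"""


def extract_names(xml_text):
    """Return one dict per tax:name element found in the treatment."""
    root = ET.fromstring(xml_text)
    records = []
    for name in root.iterfind(".//tax:name", NS):
        xid = name.find("tax:xid", NS)
        records.append({
            "genus": name.findtext("tax:xmldata/dc:Genus", namespaces=NS),
            "species": name.findtext("tax:xmldata/dc:Species", namespaces=NS),
            "source": xid.get("source") if xid is not None else None,
            "identifier": xid.get("identifier") if xid is not None else None,
        })
    return records


records = extract_names(SAMPLE)
print(records)
```

The same traversal would not be possible on a PDF page, where the name, its status and the figure reference are only distinguishable by layout.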
content in a complex e-environment
linking
Azteca instabilis
would then read like:
<tax:name>
  <tax:xid source="LSID" identifier="urn:lsid:biosci.ohio-state.edu.osuc_concetps:13452"/>   <!-- Link to external database -->
  <tax:xmldata>   <!-- Normalization of data -->
    <dc:Genus>Azteca</dc:Genus>
    <dc:Species>instabilis</dc:Species>
  </tax:xmldata>
  Azteca instabilis
</tax:name>
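The two roles of this markup, linking to an external database and normalizing the name, can be illustrated with a short Python sketch. It wraps a verbatim name mention in the structure shown above and then groups mentions by their identifier. The namespace URIs, the helper names `mark_up_name` and `group_by_identifier`, and the abbreviated mention "A. instabilis" are assumptions made for illustration; the identifier string is copied verbatim from the slide.

```python
import xml.etree.ElementTree as ET

TAX = "http://example.org/taxonx"       # hypothetical namespace URI
DC = "http://example.org/darwin-core"   # hypothetical namespace URI
ET.register_namespace("tax", TAX)
ET.register_namespace("dc", DC)


def mark_up_name(verbatim, genus, species, source, identifier):
    """Wrap a verbatim name mention in a tax:name element."""
    name = ET.Element(f"{{{TAX}}}name")
    # Link to an external database.
    ET.SubElement(name, f"{{{TAX}}}xid",
                  {"source": source, "identifier": identifier})
    # Normalized form of the name.
    data = ET.SubElement(name, f"{{{TAX}}}xmldata")
    ET.SubElement(data, f"{{{DC}}}Genus").text = genus
    ET.SubElement(data, f"{{{DC}}}Species").text = species
    # The text as printed follows the normalized data.
    data.tail = verbatim
    return name


def group_by_identifier(names):
    """Map each external identifier to the verbatim mentions that carry it."""
    index = {}
    for name in names:
        xid = name.find(f"{{{TAX}}}xid")
        data = name.find(f"{{{TAX}}}xmldata")
        mention = (data.tail or "").strip()
        index.setdefault(xid.get("identifier"), []).append(mention)
    return index


# Identifier copied verbatim from the slide.
LSID = "urn:lsid:biosci.ohio-state.edu.osuc_concetps:13452"

mentions = [
    mark_up_name("Azteca instabilis", "Azteca", "instabilis", "LSID", LSID),
    # Hypothetical abbreviated mention of the same taxon.
    mark_up_name("A. instabilis", "Azteca", "instabilis", "LSID", LSID),
]
index = group_by_identifier(mentions)
xml_text = ET.tostring(mentions[0], encoding="unicode")
print(xml_text)
print(index)
```

Because both mentions carry the same identifier, they end up under one key even though the printed strings differ.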
definition of XML tags
DTD
schema
transformations from XML
HTML
print
PDF
archiving
RDF
database
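As a rough illustration of two of these targets, the Python sketch below turns one marked-up name into an HTML fragment and into subject-predicate-object tuples of the kind an RDF export might contain. The namespace URIs, the CSS class, the `data-id` attribute and the plain-string predicates are all placeholders chosen for this example; a real pipeline would more likely use XSLT and an established vocabulary.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical namespace URIs: the slides do not show the real ones.
NS = {
    "tax": "http://example.org/taxonx",
    "dc": "http://example.org/darwin-core",
}

SAMPLE = """\
<tax:name xmlns:tax="http://example.org/taxonx"
          xmlns:dc="http://example.org/darwin-core">
  <tax:xid source="HNS" identifier="193329"/>
  <tax:xmldata>
    <dc:Genus>Mystrium</dc:Genus>
    <dc:Species>leonie</dc:Species>
  </tax:xmldata>
  Mystrium leonie
</tax:name>
"""


def parts(name):
    """Pull source, identifier, genus and species out of a tax:name element."""
    xid = name.find("tax:xid", NS)
    genus = name.findtext("tax:xmldata/dc:Genus", namespaces=NS)
    species = name.findtext("tax:xmldata/dc:Species", namespaces=NS)
    return xid.get("source"), xid.get("identifier"), genus, species


def to_html(name):
    """Render the name as an italic HTML fragment that carries its identifier."""
    _, identifier, genus, species = parts(name)
    return '<i class="taxon-name" data-id="{}">{} {}</i>'.format(
        escape(identifier, quote=True), escape(genus), escape(species))


def to_triples(name):
    """Render the name as (subject, predicate, object) tuples.

    The predicates are plain illustrative strings, not a real RDF vocabulary.
    """
    source, identifier, genus, species = parts(name)
    subject = f"{source}:{identifier}"
    return [(subject, "genus", genus), (subject, "species", species)]


element = ET.fromstring(SAMPLE)
html_fragment = to_html(element)
triples = to_triples(element)
print(html_fragment)
print(triples)
```

Both outputs are derived from the same source element, which is the point of keeping the XML as the master copy.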
legacy: TaxonX
prospective: TaxPub
how to use XML?
legacy publications
Plazi workflow: GoldenGATE editor-based markup and linking
- Get LSID from Hymenoptera Name Server for names; ZooBank?
- Add new names
- Get bibliographic GUIDs from bioguid (or EDIT?)
- Get bibliographic metadata from HNS (MODS)
- Get GUIDs for
   CBOL
   NCBI
   specimen
   images
   .....
- Get geographic long/lat from geonames.org
linked data
