Biodiversity Data vs. the Web 2.0
OR
How I learned to stop worrying and
love the “systems”
Ana Dal Molin
J. B. Woolley
Texas A&M University
Source: Opte.org
Jan 2005
[ Why this talk ]
• Data providers
• Aggregators
• Tools
• etc
“growth in bioinformatics data
exceeded Moore’s Law, the well-
known observation that the number
of transistors on a chip doubles
every 18 months.”
(Butte, 2001, TRENDS in Biotechnology 19(5))
• Johnson, N. 2007. Annual Rev. Entomology
• http://www.ala.org.au/about-the-atlas/downloadable-tools/tools-review/
• iDigBio
47*
[ what do I use? ]
• Museums often have already decided on a
model/database system
• Individual researchers, on the other hand, often have not,
which raises questions:
– Content management systems (CMS)?
– Which output?
– Stability?
– Best practices?
‘systems’ available
• First Generation: desktop-based (MS Access,
FileMaker)
• Second Generation: desktop-based with web output
• Third Generation: content management systems
(PHP, Ruby, MySQL, etc.)
Data Accessibility
Your data on the ‘net
• Reach
• Model
GBIF species distribution data coverage (2010)
[ ? ]
Metadata
Data
Metadata
repository
Name Index
Occurrence Index
Yellow Pages
Regional Atlas
Annotation Tools
Biosecurity Portal
Analysis Tools
Products
LaSalle, 2008. Atlas of Living Australia, ICE2008 presentation
[ where do I stand? ]
• Taxonomy as a two-natured science
• Shifts in media format
Web 1.0 -> Web 3.0
• 1.0: Static HTML, e-mail, forums, chat
• 2.0: Dynamic HTML, wikis, blogging,
commenting, social networking
• 3.0: …
*You and your work are not invisible before
publication*
• Web 3.0:
– “Social”
– Tags
– Cloud computing
– Ubiquitous connectivity
– Open technologies, open data formats (and open identity
too)
– Publishing in languages specifically designed for data
(databases, markup)
– Semantic web
– Marketing
http://www.tdwg.org
• What the user wants
• What you have to deal with
*
*not done!
Think it through
Books
• Gutenberg
• Project Gutenberg
• WorldCat
• HathiTrust
The way we collect information is different
The way we accumulate information is
different
The way we understand information is
different
… or not
Jan/2012
33% USA, 20% Brazil, 26% Europe
(Germany, Sweden, Spain, Greece, UK)
Da molin databases_ecn_2012
1.0 2.0
• Web 3.0
1. People lie
2. People are lazy
3. People are stupid
4. Mission: impossible – know
thyself
5. Schemas aren’t neutral
6. Metrics influence results
7. There’s more than one way to
describe something
C. Doctorow, Metacrap, 2001
Issues
• “Unification”* is not going to happen: curators and
researchers will always have their own (although often
largely overlapping) sets of crucial information fields,
which can be cross-linked
• These days, it is imperative that databases
communicate with each other
• A ‘unitary taxonomy’ is also not possible; any large
database needs to be able to display conflicting ideas
* Thomas, C. “Biodiversity databases spread, prompting unification
call”, Science v. 325 (2009)
** http://hymao.org
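A minimal sketch of the cross-linking idea, in Python. It assumes two independently designed specimen databases whose local field names (all hypothetical here) are mapped onto shared Darwin Core term names from TDWG; fields without a shared counterpart simply stay local. The records in the usage note are made up for illustration, and this is one possible approach rather than how any particular system does it.

```python
# Local field names below are hypothetical. The values they map to are
# Darwin Core term names; None marks a field with no shared counterpart
# in this sketch.
MUSEUM_TO_DWC = {
    "cat_no": "catalogNumber",
    "taxon": "scientificName",
    "collector": "recordedBy",
    "lat": "decimalLatitude",
    "lon": "decimalLongitude",
}
RESEARCHER_TO_DWC = {
    "specimen_id": "catalogNumber",
    "species_name": "scientificName",
    "host_plant": None,
}


def to_shared(record, mapping):
    """Split a local record into (shared fields, local-only fields)."""
    shared, local_only = {}, {}
    for field, value in record.items():
        term = mapping.get(field)
        if term is None:
            local_only[field] = value
        else:
            shared[term] = value
    return shared, local_only


def cross_link(records_a, map_a, records_b, map_b, key="catalogNumber"):
    """Pair records from two databases that carry the same key value."""
    index = {}
    for rec in records_a:
        shared, _ = to_shared(rec, map_a)
        if key in shared:
            index[shared[key]] = rec
    links = []
    for rec in records_b:
        shared, _ = to_shared(rec, map_b)
        if shared.get(key) in index:
            links.append((index[shared[key]], rec))
    return links
```

For example, a museum record `{"cat_no": "X-001", "taxon": "Aphelinus mali"}` and a researcher record `{"specimen_id": "X-001", "species_name": "Aphelinus mali (Haldeman)", "host_plant": "Malus"}` would be linked by catalog number while each keeps its own version of the name, so conflicting ideas remain visible side by side.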
Data ephemerality
• Local vs. Web data
?!
Source: Wikipedia, “Science 2.0”
Data ephemerality
• Digital data preservation: Internet Archive, IIPC
• Library of Congress discussions and recommendations
– Disclosure, Adoption, Transparency, External dependency, Technical
protection
• http://www.digitalpreservation.gov/formats
• User perspective
• “Incomplete” sites
• Dynamic information
• Selective information?
Why I am not a luddite:
• Online databases are both a taxonomic product and
marketing for your work
• Online biodiversity databases complement your
work
• But it’s up to you to make the user
understand that your work is more than that
• The user of online databases is probably not the
same as the person who will get your paper
summing up
• Choose the system based on the reports you want or need to
deliver
… or work with a journal/team that can help you
• Make sure the system is flexible enough in your hands
• Decide who will do the maintenance of your data
– How big is your team?
– Fluidity (positive and negative)
• Think about stability and backup strategies
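One way to make the backup point concrete is a small Python sketch, assuming the database can export its records to a file (the function names and the file layout here are hypothetical). It copies the export to a backup directory and checks that the copy is byte-for-byte identical by comparing SHA-256 digests; a real strategy would also need off-site copies and a schedule, which this sketch does not cover.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_export(export_file, backup_dir):
    """Copy an export into backup_dir and confirm the copy is identical."""
    source = Path(export_file)
    target_dir = Path(backup_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / source.name
    shutil.copy2(source, target)
    # A mismatch means the copy is corrupt and should not be trusted.
    if sha256_of(source) != sha256_of(target):
        raise IOError(f"backup of {source} does not match the original")
    return target
```

Calling `backup_export("specimens.csv", "backups")` would return the path of the verified copy, or raise an error if the copy differs from the original.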
Thanks!!
