The Era of Open
Philip E. Bourne
University of California San Diego
pbourne@ucsd.edu
WikiSym+OpenSym Aug 7, 2013 1
The Era of Open Has The
Potential to Deinstitutionalize
Daniel Hulshizer/Associated Press
An Example of That Potential:
The Story of Meredith
http://fora.tv/2012/04/20/Congress_Unplugged_Phil_Bourne
Deinstitutionalization
Vs Conservatism
It Starts with the Metrics of
Success
[Adapted from Carole Goble]
Committee on Academic
Promotions
• What Counts
– Money
– Grants
– Papers
– Teaching
– Service
• What Does Not
– Sharing data
– Sharing software
– Open access
– Collaboration
– Patents
– Startups
Getting Ahead as a Computational Biologist in Academia PLOS Comp Biol
Interim Solution:
Use the Traditional Reward System
The Wikipedia Experiment – Topic Pages
• Identify areas of Wikipedia that
relate to the journal that are
missing or stubs
• Develop a Wikipedia page in the
sandbox
• Have a Topic Page Editor review
the page
• Publish the copy of record with
associated rewards
• Release the living version into
Wikipedia
MOOCs Are Another Form of
Disruption
In Short Most Academic
Institutions Have Yet to
Embrace the Open Digital
Enterprise They Surely Will
Become
• Anyone, anything,
anytime
• publication access, data,
models, source codes,
resources, transparent
methods, standards,
formats, identifiers, apis,
licenses, education,
policies
• “accessible, intelligible,
assessable, reusable”
http://royalsociety.org/policy/projects/science-public-enterprise/report/
[Carole Goble]
Business Models Rule
• The Internet demanded new business models to
support scholarly communication
• Open access was one such sustainable model:
– Began with the community
– Was driven by new organizations (PLOS, BMC,
F1000, eLife, Dryad, Mendeley etc.)
– Was NOT driven by academic institutions
– Was driven by policies and funders
One Metric of Change:
Multidisciplinary Open Access
Mega Journal
• This year PLOS ONE
will publish over
30,000 papers!
This Disruption Got Us
Thinking About…
• A paper as only one form of knowledge
discovery
• The use of interaction and rich media from
which to learn and actually do science
• Reproducibility
• Reward structures
• Better management of the research lifecycle
P.E. Bourne 2005 In the Future will a Biological Database Really be Different
from a Biological Journal? PLOS Comp. Biol. 1(3) e34
Better Management of the
Research Lifecycle is Not a
New Concept
“An article about
computational science in a
scientific publication is not the
scholarship itself, it is merely
advertising of the scholarship.
The actual scholarship is the
complete software
development environment,
[the complete data] and the
complete set of instructions
which generated the figures.”
David Donoho, “Wavelab and
Reproducible Research,” 1995
datasets
data collections
algorithms
configurations
tools and apps
codes
workflows
scripts
code libraries
services
system software
infrastructure
compilers
hardware
Morin et al Shining Light into Black Boxes
Science 13 April 2012: 336(6078) 159-160
Ince et al The case for open computer
programs, Nature 482, 2012
[Carole Goble]
The Research Lifecycle
IDEAS – HYPOTHESES – EXPERIMENTS – DATA – ANALYSIS – COMPREHENSION – DISSEMINATION
Authoring
Tools
Lab
Notebooks
Data
Capture
Software
Repositories
Analysis
Tools
Visualization
Scholarly
Communication
Commercial &
Public Tools
Git-like
Resources
By Discipline
Data Journals
Discipline-Based Metadata
Standards
Community Portals
Institutional Repositories
New Reward
Systems
Commercial Repositories
Training
automate: workflows, pipeline & service integrative frameworks
pool, share & collaborate web systems
nanopub
semantics & ontologies
machine readable documentation
scientific software engineering
CS
SE
[Carole Goble]
Why is This Important to Me
Personally?
• My wife is being treated for stage 1 breast
cancer
• This highlights for me the disparity
between what is happening in the lab and
what is happening in the clinic
– In the lab cancer is a personalized and
treatable condition
– In the clinic we are still equally “poisoning”
patients with drugs first introduced 10-20
years ago
http://sagecongress.org/Presentations/Sommer.pdf
[Josh Sommer]
Most Laboratories
• We are the long tail
• Goodbye to the student is
goodbye to the data
• Very few of us have
complied (or will comply)
with the data
management plans we
write into grants
• Too much software is
unusable
S. Veretnik, J.L. Fink, and P.E. Bourne 2008 Computational Biology Resources Lack
Persistence and Usability. PLoS Comp. Biol. 4(7): e1000136
Today’s Research Lifecycle is
Digitally Fragmented at Best
• Proof:
– I can’t immediately reproduce the research in
my own laboratory
• It took an estimated 280 hours for an average user
to approximately reproduce the paper
– Workflows are maturing and becoming helpful
– Data and software versions and accessibility
prevent exact reproducibility
Daniel Garijo et al. 2013 Quantifying Reproducibility in Computational Biology:
The Case of the Tuberculosis Drugome PLOS ONE under review.
At the Same Time The
Disruption Continues
G8 open data charter
http://opensource.com/government/13/7/open-data-charter-g8
And the Disruption Continues
• In the US alone…
– March 2012 OSTP
commits $200M to Big
Data
– OSTP demands
sharing plans by
August 2013
– GBMF/Sloan provide
institutional awards for
data science
– NCBI considers data
catalog and
MyBibliography
Where Will It End?
First We Should Ask What It Is
We Wish to Accomplish
Here is What I Want – The Paper
As Experiment
0. Full text of PLoS papers stored in a database
1. A link brings up figures from the paper
2. Clicking the paper figure retrieves data from the PDB which is analyzed
3. A composite view of journal and database content results
4. The composite view has links to pertinent blocks of literature text and back to the PDB
1. User clicks on thumbnail
2. Metadata and a webservices call provide a renderable image that can be annotated
3. Selecting a feature provides a database/literature mashup
4. That leads to new papers
PLoS Comp. Biol. 2005 1(3) e34
Here is What I Want –
Knowledge Push
• Each evening the lab’s “Evernote”
notebooks are scanned for commonalities
from the day’s activities. These are seeds
in a deep search of the web’s research
lifecycles that have become available since
last searched. Results are ranked and
presented for consideration over coffee
the next morning
http://www.discoveryinformaticsinitiative.org/diw2012
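The knowledge-push idea can be sketched in a few lines. This is only an illustration under stated assumptions: notebook entries and newly available items are plain strings, "commonalities" are approximated as terms shared by at least two of the day's entries, and ranking is a simple count of matching seed terms. The names `tokenize`, `seed_terms`, and `rank_new_items` are hypothetical, and no real Evernote or search API is used.

```python
import re
from collections import Counter

# Assumption: a tiny hand-picked stopword list stands in for real text processing.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "for", "with", "on", "is", "was"}


def tokenize(text):
    """Lowercase a text and return its set of non-stopword terms."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}


def seed_terms(notebooks, min_notebooks=2):
    """Return terms that occur in at least `min_notebooks` of the day's entries."""
    counts = Counter()
    for entry in notebooks:
        counts.update(tokenize(entry))  # each entry contributes a term at most once
    return {term for term, n in counts.items() if n >= min_notebooks}


def rank_new_items(seeds, items):
    """Rank (title, text) items by number of seed terms mentioned; drop non-matches."""
    scored = [(len(seeds & tokenize(text)), title) for title, text in items]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [title for score, title in ranked if score > 0]


# Hypothetical data for one evening's run.
notebooks = [
    "Docked ligand to kinase structure",
    "Kinase inhibitor assay with new ligand",
    "Ordered pipette tips",
]
new_items = [
    ("Dataset A", "kinase ligand binding affinities"),
    ("Paper B", "a kinase phylogeny"),
    ("Paper C", "soil microbiome survey"),
]
print(rank_new_items(seed_terms(notebooks), new_items))  # → ['Dataset A', 'Paper B']
```

A real system would need far better notions of commonality and relevance; the point is only that the loop is scan, seed, search, rank.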
Will End With …
• Infrastructure:
– Science, Nature, Cell and megajournals all
“open access”
– An array of coupled institutional repositories
– A central repository – PubMed Central
– Open software in full support of the research
lifecycle
– The research lifecycle in the cloud
Will End With …
• Sociologically:
– An end to “build it and they will come”
– Alternative metrics accepted by the
community
– Alternative reward systems that recognize the
realities of today’s scholarship, namely:
• Open data availability
• Software availability
• Collaborative research
We Have a Way to Go
Consider the Life Sciences
• Good News
– We have NCBI/EBI
– Publishers are starting
to embrace data
– Workflows in support
of the research
lifecycle are catching
on
• Bad News
– Sustainability remains
a noun not a verb
– Data are organized by
type not by questions
asked (silos)
– Tenure committees
are still in the dark
ages
What Can We Do As a
Community?
Build Trust
Data
Trust in the data
and the derived
knowledge
What I Have Learned About
Trust 1/2
• Trust is like compound interest
• Comes from listening
• Comes from engaging the community in
every aspect of the process
• Comes from data consistency and level of
annotation
• Comes from responsiveness
• Comes from the quality of the delivery
service
What I Have Learned About
Trust 2/2
• Quality begets trust
– Quality requires data models/ontologies
• Quality requires people
– Annotators are the unsung heroes
• Trust requires provenance & versioning
• Trust requires explaining that all data and
knowledge are not created equal
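As a rough illustration of what provenance and versioning can mean in practice, the sketch below keeps each version of a dataset immutable, records who or what produced it, links it to its parent version, and derives a checksum from the content. The `DataVersion` class and its fields are hypothetical and chosen for this example only; they are not taken from any particular repository or standard.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class DataVersion:
    """One immutable version of a dataset, plus a record of where it came from."""
    name: str
    version: int
    content: bytes
    source: str  # hypothetical field: who or what produced this version
    parent: Optional["DataVersion"] = None

    @property
    def checksum(self) -> str:
        """SHA-256 of the content, so consumers can verify what they received."""
        return hashlib.sha256(self.content).hexdigest()

    def revise(self, content: bytes, source: str) -> "DataVersion":
        """Create the next version; the old one is kept as `parent`, never overwritten."""
        return DataVersion(self.name, self.version + 1, content, source, parent=self)

    def lineage(self) -> List["DataVersion"]:
        """Return all versions from the first to this one."""
        chain = []
        node: Optional["DataVersion"] = self
        while node is not None:
            chain.append(node)
            node = node.parent
        return list(reversed(chain))


# Hypothetical usage: a deposited entry that an annotator later revises.
v1 = DataVersion("structures", 1, b"raw", "deposition")
v2 = v1.revise(b"raw+annotated", "annotator")
for v in v2.lineage():
    print(v.version, v.source, v.checksum[:12])
```

Keeping earlier versions reachable means a consumer can always ask which version a result was computed from and who changed it since.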
Beyond Building Trust What
Else Can We Do?
Think Globally Act Locally
• Support emergent community commons/portals
• Be involved in the support and development of
metadata standards
• Contribute to workflow development etc. to drive
an open research lifecycle
• Educate your mentors on the importance of
open science and scholarly communication
• Write software thinking of an App model
Understand That All
Data/Knowledge Are NOT
Created Equal
• We need to understand
how data are used
• Sustainability is not
more money from the
funding agencies; it’s
about business models
• Reductionism is not a
dirty word
• We need to do more
with the long tail
On the Future of Genomic Data
Science 11 February 2011:
vol. 331 no. 6018 728-729
Recognize That Institutions
Must Play a Greater Role
• We need institutional data/knowledge
sharing plans
• We need data/information scientists to be
better recognized by institutions – it’s not
all about papers – this implies new metrics
Learn from the App Store
• The App model
– Think of it operating on a content base
rather than a mobile device
– Simple and consistent user interface
– Needs to pass some quality control
– Has a reward
• The App+ Model
– Apps interoperate through a generic
workflow interface
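One way to picture the App+ model is a set of small apps that all accept and return the same kind of content record, so a generic workflow can chain them. The sketch below is an assumption-laden illustration: the record is a plain dictionary, and `run_workflow`, `count_words`, and `flag_long` are hypothetical names invented for this example.

```python
from typing import Any, Callable, Dict, List

# Assumption: an "app" is any function that takes a content record and returns an updated one.
Record = Dict[str, Any]
App = Callable[[Record], Record]


def run_workflow(apps: List[App], content: Record) -> Record:
    """Pass one content record through each app in order."""
    for app in apps:
        content = app(dict(content))  # shallow copy so an app cannot alter its caller's record
    return content


def count_words(record: Record) -> Record:
    """Example app: annotate the record with a word count of its text."""
    record["word_count"] = len(str(record["text"]).split())
    return record


def flag_long(record: Record) -> Record:
    """Example app: build on the previous app's output."""
    record["is_long"] = record["word_count"] > 5
    return record


paper = {"text": "open science accelerates discovery"}
print(run_workflow([count_words, flag_long], paper))
```

Because every app shares one interface, adding, removing, or reordering apps does not require changes to the apps themselves, which is the interoperability the slide asks for.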
In Summary
• Open science is a means to accelerate
the rate of discovery
• Disruption has begun, but there is great
inertia in the system
• All of us are stakeholders and capable of
invoking further positive change
• We need to get institutions and more
scientists involved…
Acknowledgements
www.force11.org
pbourne@ucsd.edu
• Force11 Manifesto
• Fourth Paradigm: Data Intensive Scientific
Discovery
http://research.microsoft.com/en-us/collaboration/fourthparadigm/


Editor's Notes

  • #8: I bought the rights to this image
  • #14: “if it isn’t open it isn’t science” Mike Ashburner
  • #23: In May myExperiment: 14,660 page views; 3,076 unique visitors; 67% new visitors, 33% returning visitors