The world of
research data:
when should
data be closed,
shared or open
Dr Heila Pienaar
Deputy Director: Strategic Innovation
UP Library Services
Content
– Physical characteristics of research data before it can be shared
– Modes of data sharing
– Case study: public humiliation in the name of Open Science
– Advantages and disadvantages of sharing research data
– AI to the rescue of open research articles?
– In conclusion
Characteristics of research data
before it can be shared
– Not all research data are digitally born
– Research data must be in digital form before it can be preserved and shared
(this is often not the case)
– Research data must be uploaded to a digital system / database / repository to
make general sharing possible
– Research data must be packaged in a format for ease of download and use
(meta-data; provenance)
– Analysis software programmes should be provided with the research data
(with instructions)
https://www.youtube.com/watch?v=N2zK3sAtr-4
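The packaging point above (meta-data; provenance) can be illustrated with a minimal sketch. It assumes the data already sits in a folder of digital files. The function names `build_manifest` and `write_manifest` and the manifest fields are hypothetical choices for illustration, not a repository standard or anything prescribed in this talk.

```python
import hashlib
import json
from pathlib import Path


def build_manifest(data_dir, title, creator, licence, provenance):
    """Describe every file under data_dir so a downloader can verify and reuse it.

    The field names are illustrative assumptions, not a formal metadata schema.
    """
    root = Path(data_dir)
    files = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        files.append({
            "path": path.relative_to(root).as_posix(),
            "bytes": path.stat().st_size,
            # A checksum lets the recipient confirm the file arrived unchanged.
            "sha256": hashlib.sha256(path.read_bytes()).hexdigest(),
        })
    return {
        "title": title,
        "creator": creator,
        "licence": licence,
        "provenance": provenance,  # free-text note on how the data was produced
        "files": files,
    }


def write_manifest(manifest, out_path):
    """Save the manifest as JSON next to (not inside) the data folder."""
    Path(out_path).write_text(json.dumps(manifest, indent=2), encoding="utf-8")
```

A researcher could call `build_manifest("survey_data", ...)` and upload the resulting JSON file together with the data and the analysis instructions. Writing the manifest outside the data folder keeps it from listing itself on a second run.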
Modes of data sharing
– Once data is collected and analysed, it can be released in a variety
of ways, ranging from:
– closed networks within an organisation or project
– to researchers working on the same topic (‘invisible colleges’)
– to platform-dependent sharing between peer organisations
– to publishing with closed licenses
– to publishing openly but with permission
– to publishing with fully open licenses
Case study: public humiliation in
the name of Open Science
– How freely should scientists share their data – a case study by Daniel Barron on
August 13, 2018 https://www.natureindex.com/news-blog/how-freely-should-
scientists-share-their-data
– Jack Gallant is a cognitive neuroscientist at the University of California,
Berkeley
– In 2011 he showed that he could—based only on measures of brain activity—
actually reconstruct images of movies people were watching
– In 2016 he showed what listening to the Moth podcast does to our brains. His
analysis of the Moth podcast was published in Nature (Moth podcast – true
stories told live)
http://gallantlab.org/index.php/publications/nishimoto-et-al-2011/
https://www.youtube.com/watch?v=k61nJkx5aDQ&feature=youtu.be
Natural speech reveals the semantic maps that tile human cerebral cortex
Publicly humiliated on Twitter
– Manlio De Domenico, a theoretical physicist, tweeted, “We keep trying to ask
access to data used in your nature 2016, but we received not a single reply, yet.”
– Gallant replied. “The original authors are still writing further primary research
papers on these data so they haven't been released yet but we expect to be
able to do that very soon.”
– “‘We still want exclusivity to publish more papers’ isn’t a great excuse. Did you
note data restrictions in the manuscript?” tweeted Andre Brown, referring to
Nature’s policy that, on publication, authors should make their data, code and
protocols “promptly” and publicly available
• De Domenico lamented that Gallant’s paper had given him a
series of ideas that he wanted to test but couldn’t
because he needed Gallant’s data. “This is not advancing
human knowledge,” De Domenico asserted.
• Gallant dug in: “And why do you assume that your project
is better than the ones that we are continuing with these
data? My students and postdocs are an awesome group
of people, the stuff they have in the pipeline is great! But
I can’t afford for them to be scooped.”
Barron (blog writer) questions
the ideals of Open Science
– A highly-productive lab writes a grant to fund a series of studies and the
development of new tools. They spend years collecting data and building the tools
for these proposed studies
– Then, they finish a portion of the project and begin to publish results. Should they
be required to release their data to the community? If so, when? Who owns that
data? And what business do journals have in enforcing data sharing?
– Barron is questioning the role of publishers and journals in the open data debate – is
it their role to force the sharing of research data? Barron remains unconvinced that
there is an immediacy to sharing most forms of scientific data—especially an
immediacy in the name of the public interest
– I am convinced that other scientists feel an immediate need to analyse data sets
that they do not own—especially if the results of a particularly excellent data set
can be published in Nature and make them famous
I also heard cautionary tales that the Open Science
movement had a dark side, that “openness” had, at
times, devolved into bullying and theft. Some
compared the Open Science movement to
Communism: good in principle, impossible in
practice. In informal settings — at dinner, over
drinks — I was reminded that science was a
competitive business
Advantages & disadvantages of
sharing research data
– Two types of research results, it seems, should be shared as soon as
possible: clinical trial data and rare-disease research.
– A framework for responsible sharing of clinical trial data has been developed in
order to maximize benefits to participants in clinical trials and to society, while
minimizing harm. https://www.ncbi.nlm.nih.gov/books/NBK253390/
– Rare-disease families’ access to medical research is very problematic. If these
families can’t afford subscription fees they have to navigate a variety of
gatekeeping mechanisms in order to access research that could help them make
critical health decisions. “Yes, everyone should have rainbows, unicorns, &
puppies delivered to their doorstep by volunteers. Y’all keep wishing for that, I’ll
keep working on producing the best knowledge and distributing it as best we
can.” Quote by an Elsevier official (later retracted).
https://slate.com/technology/2018/08/who-gets-to-read-the-research-
taxpayers-fund.html
Advantages
– To maximize the impact of data (or conclusions drawn from it). Scientists base their research
on foundations laid by other scientists, e.g. scientific theories proven by other scientists and
research data collected by other scientists.
– Sharing data reduces both the cost of data collection and the overall cost of research
– The use of common databases allows scientists to test and retest their findings against those
of other scientists and promotes progress in science
http://sites.nationalacademies.org/cs/groups/pgasite/documents/webpage/pga_053478.pdf
– Inform collaboration
– Provide stronger evidence for advocacy
– Increase efficiency of service delivery among a wider audience than just within your project
– Play a role in decision making within other projects
– Publishing your data can also allow people who might not otherwise have been well enough
informed about your project to have a say, for example those who are reflected in the data
Disadvantages
– Once data is published, it’s impossible to anticipate how it might be shared
further, and once it’s out in the open, there’s no telling how it will be adopted,
re-purposed and re-used for any number of purposes
– Ethical and practical implications, e.g. a group of people could be identified in
spite of anonymising personal information
– Participants may see their data used in a way they don’t agree with, or that
puts them in danger
– Researchers should carefully consider the implications of sharing, whether to
share at all, and the licensing conditions or terms and tools that they can use to
reduce the risk of harm, while still permitting beneficial outcomes
https://responsibledata.io/resources/handbook/chapters/chapter-02c-sharing-
data.html
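The re-identification risk mentioned above can be checked roughly before release. The sketch below is a simple k-anonymity style count, which is a technique brought in here for illustration and not something taken from the handbook cited above. The function names, the column names and the threshold `k=5` are all assumptions.

```python
from collections import Counter


def _key(record, quasi_identifiers):
    """Combination of values that could single a person out (e.g. age band + town)."""
    return tuple(record[q] for q in quasi_identifiers)


def smallest_group(records, quasi_identifiers):
    """Size of the rarest combination of quasi-identifier values (0 if no records)."""
    counts = Counter(_key(r, quasi_identifiers) for r in records)
    return min(counts.values()) if counts else 0


def risky_records(records, quasi_identifiers, k=5):
    """Records whose combination is shared by fewer than k people.

    k=5 is an arbitrary illustrative threshold, not a recommended value.
    """
    counts = Counter(_key(r, quasi_identifiers) for r in records)
    return [r for r in records if counts[_key(r, quasi_identifiers)] < k]
```

If `risky_records` returns anything, names may have been removed but some participants could still be picked out from the remaining columns, so the data may need coarser categories or a more restrictive mode of sharing.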
AI to the rescue of open research
articles, and perhaps open research
data at a later stage?
– Researchers who are busy with specialised research sometimes query the open
access movement as they are of the opinion that few people will understand their
work
– Impactstory to the rescue - Get The Research: Impactstory announces a new
Science-Finding tool for the general public (https://gettheresearch.org)
– It is aimed at serving the general public rather than an audience of scholars and
specialists, and it promises to provide a new level of accessibility to published
scholarship.
– It will be built on the 20 million open access articles in the Unpaywall
(http://unpaywall.org) index, and feature AI-powered tools that help make the
content and context of scholarly articles clearer to readers.
https://scholarlykitchen.sspnet.org/2018/11/12/get-the-research-impactstory-
announces-a-new-science-finding-tool-for-the-general-public/
In conclusion
Sharing of data is highly dependent on the type of
research, the type of data and, most importantly, the
requirements of the funder for that specific research.
Open data / Open Science should not be cast in stone, but
should be implemented with great care in order not to
damage the scientific enterprise
