Managing sensitive data in your
repository
Natasha Simons
Sharing Health-y and Sensitive Data: Challenges and Solutions Workshop
Perth 3 September 2015
What is a data repository?
A research data repository is a
managed environment capable of
storing and sharing (largely)
digital data. The data repository
supports the process of curating,
preserving, and sharing research
data.
What kinds of data repositories are there?
Are repositories for open data only?
Yes and no: it depends on the repository's purpose and scope.
Repositories can support data that is:
1. Open access only
2. Mediated access only
3. Closed/private only
Most data repositories are a combination of 1 & 2
Are there health data repositories?
Yes, many!
http://www.nlm.nih.gov/NIHbmic/nih_data_sharing_repositories.html
What’s the point of data repositories?
Data repositories assist researchers and
the research community to:
1. Support data sharing, data discovery &
reuse, data preservation
2. Comply with publisher requirements
3. Comply with funder requirements
4. Comply with institutional or government policy
requirements
5. Support institutional goals
Illustration credit: Ainsley Seago. doi:10.1371/journal.pbio.1001779.g001
Can sensitive data be managed in a repository?
Yes!
Ask:
• Can the raw data be (de-identified and)
made completely open? Or will access be
restricted? Mediated?
• What licence should be applied to enable
data reuse?
• What metadata elements, links (e.g. to
publications) and identifiers (e.g. DOIs,
ORCIDs) will aid discovery and reuse of the
data?
Source: http://www.slideshare.net/WLSA_ORG/wh2014-workshop-health-data-consortium
Also ask:
• Can a citation element be added to
support attribution and reuse
tracking?
• Who or what will be the point of
contact for the data?
• Is the data subject to any other
conditions, e.g. release only after
an embargo period?
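The questions above can be read as fields on a repository record. The following is a minimal sketch under stated assumptions, not the schema of any particular repository: the class name `DatasetRecord`, its field names, the `Access` values and the sample DOI are all hypothetical and chosen only to mirror the questions (access level, licence, identifiers, citation, contact, embargo).

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum
from typing import List, Optional


class Access(Enum):
    """Hypothetical access levels mirroring open / mediated / closed."""
    OPEN = "open"
    MEDIATED = "mediated"
    CLOSED = "closed"


@dataclass
class DatasetRecord:
    """Illustrative record; field names are assumptions, not a standard."""
    title: str
    access: Access
    licence: str                                   # e.g. "CC-BY-4.0"
    doi: Optional[str] = None                      # identifier for the dataset
    creator_orcids: List[str] = field(default_factory=list)
    related_publications: List[str] = field(default_factory=list)
    citation: Optional[str] = None                 # supports attribution
    contact: Optional[str] = None                  # person or role to ask
    embargo_until: Optional[date] = None           # None means no embargo

    def is_downloadable(self, today: date) -> bool:
        """Direct download only if access is open and any embargo has ended."""
        if self.access is not Access.OPEN:
            return False
        return self.embargo_until is None or today >= self.embargo_until


# Hypothetical example: an open, de-identified dataset under embargo.
record = DatasetRecord(
    title="Example de-identified survey data",
    access=Access.OPEN,
    licence="CC-BY-4.0",
    doi="10.0000/example",                         # placeholder, not a real DOI
    contact="data-manager@example.org",
    embargo_until=date(2016, 1, 1),
)
```

A mediated or closed record would never be directly downloadable in this sketch; requests for it would go to the contact instead.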
Examples of sensitive data in repositories?
What’s really challenging?
“Having longitudinal data on individuals is a part of many observational designs, and is
needed for research into outcomes, efficacy and many mechanistic studies. Most
repositories thus have longitudinal observations. To build such a database you need some
way to link observations on the same identified person. Therefore most repositories contain
personally identified data, but, because of privacy concerns, they often release only
de-identified data. Difficulties in the de-identification process can cause some data to be
omitted in a dataset. A lack of direct identifiers in a data collection or federation could
prevent linking of data for some patients.”
From: Wade, T. Traits and Types of Health Data Repositories. Health Information Science
and Systems 2014, 2:4. doi:10.1186/2047-2501-2-4
http://www.hissjournal.com/content/2/1/4
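One common way to keep observations on the same person linkable after direct identifiers are removed is to replace the identifier with a keyed hash. The sketch below illustrates that general technique only; it is not taken from Wade's paper or from any repository's practice. The function names, the `person_id` field and the secret key are hypothetical, and a keyed hash alone does not make a dataset de-identified, because other fields may still identify people.

```python
import hashlib
import hmac


def link_key(person_id: str, secret: bytes) -> str:
    """Derive a stable pseudonym from a direct identifier (HMAC-SHA256)."""
    return hmac.new(secret, person_id.encode("utf-8"), hashlib.sha256).hexdigest()


def deidentify(observations, secret):
    """Return copies of the observations with person_id replaced by a link key."""
    released = []
    for obs in observations:
        # Drop the direct identifier, keep the remaining fields unchanged.
        row = {k: v for k, v in obs.items() if k != "person_id"}
        row["link_key"] = link_key(obs["person_id"], secret)
        released.append(row)
    return released


# Hypothetical longitudinal observations: two visits for A, one for B.
observations = [
    {"person_id": "patient-A", "visit": 1, "systolic_bp": 128},
    {"person_id": "patient-A", "visit": 2, "systolic_bp": 121},
    {"person_id": "patient-B", "visit": 1, "systolic_bp": 140},
]

# Assumption: the key is held only by the data custodian, never released.
released = deidentify(observations, secret=b"custodian-held-key")
```

Rows for the same person share a link key, so the longitudinal series survives release. Records with no direct identifier get no key at all, which is the linkage failure the quotation describes.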
Small group exercise
Discovering sensitive health data in repositories
Acknowledgement
Australian National Data Service is funded by
the Commonwealth under the NCRIS Program
31 August 2015
