… because good research needs good data




 Introduction to Research Data
Management: activities, roles and
         requirements
                                         Michael Day
                                   Digital Curation Centre
                                  UKOLN, University of Bath
                                    m.day@ukoln.ac.uk


      This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 2.5 UK: Scotland
      License. To view a copy of this license, (a) visit http://creativecommons.org/licenses/by-nc-sa/2.5/scotland/ ; or,
      (b) send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA.


Demystifying Research Data, JIBS/RLUK event, SOAS, London, 17 July 2012



Outline
    • Introduction
    • The researcher perspective
        • Codes of Practice
        • Research funding bodies
    • The institutional perspective
    • Research lifecycles
        • Some lifecycle models
        • The role of the library
    • Activities, roles and requirements





Why manage research data?
 • Enable reuse
 • Research integrity
 • Research impact
    • Linking data and publication
    • Making data citable
 • Regulatory requirements
 • Controlling costs
 • Maximising value




Who are the main actors?
 •     Researchers - as creators and users
 •     Other data creators
 •     Other data (re)users
 •     Funding bodies
 •     Data centres
 •     Computer science researchers
 •     Libraries
 •     Research support/grant offices
 •     Archivists/records managers



What is required?
 • Technical infrastructure
    •   Storage (many options)
    •   Tools
    •   Discovery
    •   Research information management (RIM)
 • Policy & commitment
 • Human infrastructure
    • Researcher skills
    • Support services
    • Training




Potential national-level actions
 •     Building dataset discovery
 •     Collecting data policies
 •     Liaising with other national & international actors
 •     Supporting uptake of cloud-based tools
 •     Exploiting the pool of data plans
 •     Collecting stories on data re-use
 •     Supporting effective citation, referencing, etc
 •     Sharing good practice




The researcher perspective
 • Managing and sharing data is simply part of good
   research:
    • Adhering to disciplinary and/or institutional codes of practice
      and policies
    • Has been practised since the advent of modern science, but
      not always consistently; data-intensive research makes it
      even more critical
    • Meeting the specific requirements of funding bodies
 • Reputational risks if data management is not handled
   properly



Research codes of practice (1)
 • UK Research Integrity Office Code of Practice for
   Research (2009)
    • Data management planning is an essential part of research
      design
    • Organisations should have in place procedures, resources
      (including physical space) and administrative support to
      assist researchers in the accurate and efficient collection of
      data and its storage in a secure and accessible form [3.12.5]







Research codes of practice (2)
 • RCUK Code of Conduct on the Governance of Good
   Research Conduct (2011)
    • Primary data and research evidence [should be made]
      accessible to others for reasonable periods after the
      completion of the research: data should normally be
      preserved and accessible for 10 years (in some cases 20 years
      or longer)
    • Responsibility for proper management and preservation of
      data and primary materials is shared between the researcher
      and the research organisation [although deposit within
      national collections is endorsed]




Research funding bodies
 • UK Research Councils
    • Help fund some data archives, e.g.:
       • Archaeology Data Service, European Bioinformatics
         Institute, the NERC data centres, UK Data Archive
    • Support for JISC (and DCC)
    • RCUK Common Principles on Data Policy
       • Recognises that data are a critical output of the research
         process
               http://www.rcuk.ac.uk/research/Pages/DataPolicy.aspx






RCUK Principles (in a nutshell)
 •     Publicly funded research data should be made openly available
 •     Data with acknowledged long-term value should be preserved and
       remain accessible and usable for future research
 •     Sufficient metadata should be recorded to enable other researchers to
       find and understand the research to enable re-use; published results
       should always include information on how to access the supporting data
 •     Recognition that there may be legal, ethical and commercial constraints
 •     Recognition that researchers may need privileged use of data for a
       limited period
 •     All users of research data should acknowledge their sources
 •     It is appropriate to use public funds to support the management of
       research data (MRD)





EPSRC expectations
 • Roadmap approved May 2012; compliance by May 2015
    • Appropriate metadata (including unique IDs) to be made freely
      available on the Internet within 12 months of data generation
    • Data not generated in digital format should be stored in a manner
      that facilitates sharing
    • Data should be securely preserved for a minimum of 10 years after
      privileged access expires or the last date access was requested by
      a third party
    • Adequate resources to be allocated from existing funding streams
    • EPSRC will monitor progress and compliance, and reserves the
      right to impose appropriate sanctions
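
The two time limits in the EPSRC expectations can be turned into simple date arithmetic. The following is a minimal sketch only: the function names are hypothetical, and it assumes that where both a privileged-access expiry and a later third-party access request exist, the 10-year clock runs from whichever date is later (the slide does not say this explicitly).

```python
from datetime import date
from typing import Optional

RETENTION_YEARS = 10   # "minimum of 10 years" from the slide
METADATA_YEARS = 1     # "within 12 months of data generation"


def add_years(d: date, years: int) -> date:
    """Return the same calendar day `years` later (29 Feb falls back to 28 Feb)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        # 29 February in a target year that is not a leap year
        return d.replace(year=d.year + years, day=28)


def metadata_deadline(data_generated: date) -> date:
    """Latest date by which discovery metadata should be published."""
    return add_years(data_generated, METADATA_YEARS)


def earliest_disposal_date(privileged_access_ends: date,
                           last_third_party_request: Optional[date] = None) -> date:
    """Earliest date the data could be considered for disposal.

    Assumption: the retention period restarts from the most recent
    third-party access request when that is later than the end of
    privileged access.
    """
    anchor = privileged_access_ends
    if last_third_party_request is not None:
        anchor = max(anchor, last_third_party_request)
    return add_years(anchor, RETENTION_YEARS)
```

For example, data generated on 17 July 2012 would need metadata online by 17 July 2013 under this reading; any real retention schedule should be checked against the funder's own wording.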



Implications for researchers
 •     Increasing number of research councils and funding bodies with data
       management and sharing requirements
 •     Potential loss of research income if these mandates are not met
 •     Need to determine the costs associated with short and longer-term
       management and curation and to request funds as part of grant
       applications
 •     Responsibility for infrastructure shifting more to HEIs and less to
       centralised data archives, but institutional infrastructures and services
       are still emerging
 •     Need guidance - some good external support
 •     But also need more local support; often fragmented (need to draw upon
       existing channels within your institution wherever possible)




Institutional drivers
 • Safeguarding research integrity
 • Increasing number of FOI requests for data
 • Adhering to existing codes of research practice and ethics
 • Developing new institution-wide strategies, policies and services
   for data storage and management
 • Increased institutional focus on research management (e.g., in
   response to REF)
 • Benchmarking – self-assessing infrastructure and planning for
   improvement
 • More demands but fewer resources to work with




Institutional actors
 • Researchers
     • Both as creators and users of data
     • PIs (e.g., have specific roles with respect to grants)
     • Computer scientists (informaticians, data scientists)
 • Administration
     • Research support office (e.g., grants support, research
       information management)
     • Records managers, archivists, FOI office
 • Central services
     • Computing services
     • Libraries (e.g., institutional repository)




        Research data lifecycles









(e)-Research Life Cycle view of Data Curation?
 [Figure: cyclical diagram by Liz Lyon, December 2005, licensed under Creative Commons
  Attribution-ShareAlike 2.0. Centre: e-Infrastructure, Open access, Collaboration.
  The stages below are linked by "Data processing" steps.]
 • Formulate hypothesis / ideas, test, experiment, observe: data creation,
   collection & capture
 • Data management, storage & validation: description, deposit,
   self-archiving, preservation, certification
 • Scholarly communications: data disclosure, publication, citation,
   discovery, re-use
 • Adding value: data linking, annotation, visualisation, simulation
 • (New) knowledge extraction: data mining, modelling, analysis, synthesis


E-Science Curation Report - 2003
 • E-science discipline
 • Appropriate for current focus
 • Takes integrated look at higher education data curation problems
 • Granularity on curation activities?



Open Archival Information System






RDM at Oxford






Research360@Bath
• New institutional data
  scientist role
• Addresses EPSRC
  expectations (published)
• Doctoral Training Centre
  hubs
• Faculty-Industry focus
• Faculty cascade model
• Multi-team approach


         http://blogs.bath.ac.uk/research360/









Some library roles (in the lifecycle)
 •     Leadership – coordinate action
 •     Audit – who has what, where does it go?
 •     Advice on access – data, wherever it is
 •     Preservation (long-term access requirements)
 •     Citability
 •     Data/publication linking
 •     Promoting data in teaching
 •     Identifying skill gaps / CPD requirements



Re-skilling for research (RLUK, 2012)
 •     Mary Auckland identified 9 key areas with skill gaps for
       subject librarians:
        •   Ability to advise on preserving research outputs
        •   Knowledge to advise on data management and
            curation, including ingest, discovery, access,
            dissemination, preservation, and portability
        •   Knowledge to support researchers in complying with the
            various mandates of funders, including open access
            requirements
        •   Knowledge to advise on potential data manipulation
            tools used in the discipline/subject
        •   Knowledge to advise on data mining
        •   Knowledge to advocate, and advise on, the use of
            metadata
        •   Ability to advise on the preservation of project records
            e.g. correspondence
        •   Knowledge of sources of research funding to assist
            researchers to identify potential funders
        •   Skills to develop metadata schema, and advise on
            discipline/subject standards and practices, for
            individual research projects



   Understanding data requirements




http://www.dcc.ac.uk/




Data management planning







Data registries
 • Findable, citable data has value
     •   Important to link publications to data (and vice versa)
     •   Increases citations – of data & publication
     •   Increases reuse (hence value)
     •   But effects exist even without publication
     •   All benefit – researcher; institution; publisher
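
A registry entry that is meant to be citable needs enough fields to build a citation linking back to the data. The sketch below is illustrative only: the function name, the creator/year/title/publisher/identifier pattern and the sample values are assumptions, not a format prescribed in these slides.

```python
from typing import List


def format_data_citation(creators: List[str], year: int, title: str,
                         publisher: str, identifier: str) -> str:
    """Build a one-line dataset citation from registry fields.

    Pattern (an assumption for illustration):
    Creators (Year): Title. Publisher. Identifier
    """
    names = "; ".join(creators)
    return f"{names} ({year}): {title}. {publisher}. {identifier}"


# Hypothetical record; the identifier is a placeholder, not a real DOI.
citation = format_data_citation(
    creators=["Researcher, A.", "Colleague, B."],
    year=2012,
    title="Example survey data",
    publisher="Example Data Centre",
    identifier="doi:10.0000/example",
)
print(citation)
```

The same fields can be stored alongside the identifier of the related publication so that links work in both directions.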







       Tools to track impact
                                              http://total-impact.org/







Activities, roles, requirements (1)
 • Requirements gathering
     • Identifying researchers’ data requirements
     • Developing a shared understanding of what needs to be
       done (e.g., identifying where data exist, their form and scale,
       any existing retention requirements)
     • Identifying good practice within the institution (and the
       opposite)
     • Methods: surveys, focus groups, case studies, joint R&D
       projects, assessment tools (e.g. DAF)






Activities, roles, requirements (2)
 • Identifying motivations and benefits
     • For researchers, support services, the institution
 • Identifying risks
     • Data loss (institution, research group, individual)
     • Increased costs (lack of planning, service inefficiency, data
       loss)
     • Legal compliance (research funder, H&S, ethics, FoI)
     • Reputation (institution, unit, individual)
 • Identifying costs
     • Keeping Research Data Safe (KRDS) toolkit



Activities, roles, requirements (3)
 • Assessing institutional preparedness
     • Identifying institutional stakeholders, existing data support services,
       gaps
     • Benchmarking and planning for the future
     • Skills audit
     • CARDIO tool
 • Policy development
     • Policies – approval by senior management is just the start; policies
       need to be embedded in research practice and responsive to
       changing requirements
 • Data management planning
     • DMPonline, DCC How to Develop a Data Management Plan guide



Activities, roles, requirements (4)
 • Implementation and service development
     • Integrating where possible with existing services, e.g. IR,
       CRIS, VRE, HPC, cloud services, social media, etc.
     • Appraisal, deciding what needs to be kept and for how long
     • Storage choices – no one-size-fits-all solution, e.g. Bristol’s
       BluePeta petascale storage facility, Bath’s X-Drive approach,
       cloud approaches
     • Data documentation and metadata – layered approaches:
       top-level discovery (core metadata, collection/experiment-level?),
       role of standards like DCMI, CERIF, DDI, etc.
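
The "top-level discovery" layer can be pictured as a small core record that every dataset carries, with richer discipline-specific metadata layered on top. The following is a rough sketch under stated assumptions: the field names borrow Dublin Core element names (title, creator, identifier, date, publisher, rights, description, subject), while the choice of which fields count as "core" and the helper name are hypothetical.

```python
from dataclasses import dataclass, field, asdict
from typing import List

# Hypothetical choice of mandatory fields for discovery-level metadata.
CORE_FIELDS = ("title", "creator", "identifier", "date", "publisher", "rights")


@dataclass
class DiscoveryRecord:
    """Collection-level record using Dublin Core element names."""
    title: str
    creator: List[str]
    identifier: str          # e.g. a persistent ID, if one has been assigned
    date: str
    publisher: str
    rights: str
    description: str = ""    # optional
    subject: List[str] = field(default_factory=list)  # optional


def missing_core_fields(record: DiscoveryRecord) -> List[str]:
    """List the core fields that are empty, in CORE_FIELDS order."""
    values = asdict(record)
    return [name for name in CORE_FIELDS if not values[name]]


# Example: a record that has not yet been given an identifier.
draft = DiscoveryRecord(
    title="Example dataset",
    creator=["A. Researcher"],
    identifier="",
    date="2012",
    publisher="Example University",
    rights="CC BY-NC-SA",
)
print(missing_core_fields(draft))
```

A check like this could run at deposit time, leaving detailed experiment-level description to standards chosen by each discipline.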





Activities, roles, requirements (5)
 • Data issues:
     • Appraisal: selection criteria, retention periods (who decides?)
         • DCC How to appraise and select research data for
           curation guide
     • Documentation: metadata, schema, semantics
     • Formats: proprietary formats, community standards, etc.
     • Provenance and authenticity
     • Citation (assignment of persistent IDs?)
     • Access (embargo policies?)
     • Licensing
         • DCC How to license research data guide



Things to do …
 • Create policy – collaborate with others
        • Growing number of policies being published (EPSRC,
          Wellcome Trust)
 • Build on existing digital services
        • Examples: storage, data registry
 •     Learn about audit tools (DCC & others)
 •     Learn about data & sources
 •     Re-skill subject librarians
 •     Bridge between publishers & researchers



DCC resources




                  http://www.dcc.ac.uk/resources






     Thank you. Any questions?


More Related Content

PPTX
Clinical data management web based data capture edc & rdc
PPT
Research methodology introduction ch1
PPT
Research Methodology lecture-01
PPT
Qualitative Data Analysis
PPTX
Research Ethics and Integrity: How COPE can help
PPT
Data indexing presentation
PPTX
publication misconduct.pptx
Clinical data management web based data capture edc & rdc
Research methodology introduction ch1
Research Methodology lecture-01
Qualitative Data Analysis
Research Ethics and Integrity: How COPE can help
Data indexing presentation
publication misconduct.pptx

What's hot (20)

PPT
Literature review (1)
PPTX
What Is Peer Review?
PPTX
Qualitative research by Dr. Subraham Pany
PPTX
scientific misconduct
PPT
Research Methodology
PDF
Avoid Scientific Misconduct
PDF
Scientific Misconduct
PPTX
Introduction to Research
PPT
Statistical Methods in Library and Information Science: A Fundamental Approach
PPTX
Publication Ethics: Overview
PPTX
Webometrics
PPTX
Database indexing techniques
PPTX
Protocol Understanding_ Clinical Data Management_KatalystHLS
PPTX
scientific misconducts.pptx
PPTX
Ppt of mixed method design
PPTX
Scientometrics
PPTX
Scientific communication
Literature review (1)
What Is Peer Review?
Qualitative research by Dr. Subraham Pany
scientific misconduct
Research Methodology
Avoid Scientific Misconduct
Scientific Misconduct
Introduction to Research
Statistical Methods in Library and Information Science: A Fundamental Approach
Publication Ethics: Overview
Webometrics
Database indexing techniques
Protocol Understanding_ Clinical Data Management_KatalystHLS
scientific misconducts.pptx
Ppt of mixed method design
Scientometrics
Scientific communication
Ad

Similar to Introduction to research data management (20)

PPTX
Michael Day JIBS-RLUK event July 2012
PDF
Introduction to Research Data Management: activities, roles and requirements
PDF
Research data challenge presentation
PPTX
Research data management: definitions, drivers and resources
PPTX
EPFL Open Research Data - a Jisc perspective
PPT
Digital Curation 101 (University of Glamorgan)
PPT
Bloomsbury Conference
PPTX
Managing and Sharing Research Data: Good practices for an ideal world...in th...
PDF
Stewardship data-guidelines- research information network jan 2008
PDF
The current challenges of upgrading the infrastructure
PDF
Supporting Research Data Management at the University of Stirling
PPT
User engagement in research data curation
PPT
What is Research Data Management? UAL
PPTX
Introduction to Research Data Management
PPT
Improving Access to Research Data: What does changing legislation mean for y...
PPTX
Uncovering research - what's the standard - Jisc Digital Festival 2015
PDF
Simon Hodson
PDF
Engaging with students and researchers: the case of the social sciences
PDF
Research Data Management Inititatives at University of Edinburgh
Michael Day JIBS-RLUK event July 2012
Introduction to Research Data Management: activities, roles and requirements
Research data challenge presentation
Research data management: definitions, drivers and resources
EPFL Open Research Data - a Jisc perspective
Digital Curation 101 (University of Glamorgan)
Bloomsbury Conference
Managing and Sharing Research Data: Good practices for an ideal world...in th...
Stewardship data-guidelines- research information network jan 2008
The current challenges of upgrading the infrastructure
Supporting Research Data Management at the University of Stirling
User engagement in research data curation
What is Research Data Management? UAL
Introduction to Research Data Management
Improving Access to Research Data: What does changing legislation mean for y...
Uncovering research - what's the standard - Jisc Digital Festival 2015
Simon Hodson
Engaging with students and researchers: the case of the social sciences
Research Data Management Inititatives at University of Edinburgh
Ad

More from Michael Day (20)

PDF
What can libraries do for researchers?
PDF
Preservation planning at the British Library
PDF
Implementing digital preservation strategy: collection profiling at the Briti...
PDF
Developing institutional RDM services
PDF
Open access data
PDF
Digital Preservation (UWE)
PDF
Continuity and change: Opportunities and challenges for the future of researc...
PDF
Developing a Community Capability Model Framework for data-intensive research
PPT
Digital Preservation
PPT
UKOLN activities on research information management
PDF
UKOLN Programme Support for the JISC Research Information Management Programme
PPT
Digital Preservation
PDF
EASTER project
PDF
Models for integrating institutional repositories and research information ma...
PDF
Research Information Management
PPT
Digital preservation exercises
PPT
Brief Introduction to Digital Preservation
PPT
Curation of Research Data
PDF
Digital preservation from a records management perspective
PDF
The Improving Access to Text (IMPACT) project and other European initiatives
What can libraries do for researchers?
Preservation planning at the British Library
Implementing digital preservation strategy: collection profiling at the Briti...
Developing institutional RDM services
Open access data
Digital Preservation (UWE)
Continuity and change: Opportunities and challenges for the future of researc...
Developing a Community Capability Model Framework for data-intensive research
Digital Preservation
UKOLN activities on research information management
UKOLN Programme Support for the JISC Research Information Management Programme
Digital Preservation
EASTER project
Models for integrating institutional repositories and research information ma...
Research Information Management
Digital preservation exercises
Brief Introduction to Digital Preservation
Curation of Research Data
Digital preservation from a records management perspective
The Improving Access to Text (IMPACT) project and other European initiatives

Recently uploaded (20)

PDF
Empathic Computing: Creating Shared Understanding
PPTX
Effective Security Operations Center (SOC) A Modern, Strategic, and Threat-In...
PDF
Electronic commerce courselecture one. Pdf
PDF
Advanced methodologies resolving dimensionality complications for autism neur...
PDF
cuic standard and advanced reporting.pdf
PDF
Chapter 3 Spatial Domain Image Processing.pdf
PDF
Building Integrated photovoltaic BIPV_UPV.pdf
PPTX
Understanding_Digital_Forensics_Presentation.pptx
PPTX
PA Analog/Digital System: The Backbone of Modern Surveillance and Communication
PDF
Reach Out and Touch Someone: Haptics and Empathic Computing
PDF
Modernizing your data center with Dell and AMD
PPTX
A Presentation on Artificial Intelligence
PDF
NewMind AI Weekly Chronicles - August'25 Week I
PDF
Encapsulation theory and applications.pdf
PDF
Mobile App Security Testing_ A Comprehensive Guide.pdf
DOCX
The AUB Centre for AI in Media Proposal.docx
PDF
7 ChatGPT Prompts to Help You Define Your Ideal Customer Profile.pdf
PDF
Diabetes mellitus diagnosis method based random forest with bat algorithm
PDF
NewMind AI Monthly Chronicles - July 2025
PDF
How UI/UX Design Impacts User Retention in Mobile Apps.pdf
Empathic Computing: Creating Shared Understanding
Effective Security Operations Center (SOC) A Modern, Strategic, and Threat-In...
Electronic commerce courselecture one. Pdf
Advanced methodologies resolving dimensionality complications for autism neur...
cuic standard and advanced reporting.pdf
Chapter 3 Spatial Domain Image Processing.pdf
Building Integrated photovoltaic BIPV_UPV.pdf
Understanding_Digital_Forensics_Presentation.pptx
PA Analog/Digital System: The Backbone of Modern Surveillance and Communication
Reach Out and Touch Someone: Haptics and Empathic Computing
Modernizing your data center with Dell and AMD
A Presentation on Artificial Intelligence
NewMind AI Weekly Chronicles - August'25 Week I
Encapsulation theory and applications.pdf
Mobile App Security Testing_ A Comprehensive Guide.pdf
The AUB Centre for AI in Media Proposal.docx
7 ChatGPT Prompts to Help You Define Your Ideal Customer Profile.pdf
Diabetes mellitus diagnosis method based random forest with bat algorithm
NewMind AI Monthly Chronicles - July 2025
How UI/UX Design Impacts User Retention in Mobile Apps.pdf

Introduction to research data management

  • 1. … because good research needs good data Introduction to Research Data Management: activities, roles and requirements Michael Day Digital Curation Centre UKOLN, University of Bath m.day@ukoln.ac.uk This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 2.5 UK: Scotland License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/2.5/scotland/ ; or, (b) send a letter to Creative Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA. Funded by: Demystifying Research Data, JIBS/RLUK event, SOAS, London, 17 July 2012
  • 2. … because good research needs good data Outline • Introduction • The researcher perspective • Codes of Practice • Research funding bodies • The institutional perspective • Research lifecycles • Some lifecycle models • The role of the library • Activities, roles and requirements Funded by: Demystifying Research Data, JIBS/RLUK event, SOAS, London, 17 July 2012
  • 3. Why manage research data?
      • Enable reuse
      • Research integrity
      • Research impact
          • Linking data and publication
          • Making data citable
      • Regulatory requirements
      • Controlling costs
      • Maximising value
  • 4. Who are the main actors?
      • Researchers, as creators and users
      • Other data creators
      • Other data (re)users
      • Funding bodies
      • Data centres
      • Computer science research
      • Libraries
      • Research support/grant offices
      • Archivists/records managers
  • 5. What is required?
      • Technical infrastructure
      • Storage (many options)
      • Tools
      • Discovery
      • Research Intelligence (RIM)
      • Policy & commitment
      • Human infrastructure
      • Researcher skills
      • Support services
      • Training
  • 6. Potential national-level actions
      • Building dataset discovery
      • Collecting data policies
      • Liaising with other national & international actors
      • Supporting uptake of cloud-based tools
      • Exploiting the pool of data plans
      • Collecting stories on data re-use
      • Supporting effective citation, referencing, etc.
      • Sharing good practice
  • 7. The researcher perspective
      • Managing and sharing data is simply part of good research:
          • Adhering to disciplinary and/or institutional codes of practice and policies
          • Has been practised since the advent of modern science, but not always consistently; data-intensive research makes it even more critical
          • Meeting the specific requirements of funding bodies
          • Reputational risks if data management is not handled properly
  • 8. Research codes of practice (1)
      • UK Research Integrity Office Code of Practice for Research (2009)
          • Data management planning is an essential part of research design
          • Organisations should have in place procedures, resources (including physical space) and administrative support to assist researchers in the accurate and efficient collection of data and its storage in a secure and accessible form [3.12.5]
  • 9. Research codes of practice (2)
      • RCUK Code of Conduct on the Governance of Good Research Conduct (2011)
          • Primary data and research evidence [should be made] accessible to others for reasonable periods after the completion of the research: data should normally be preserved and accessible for 10 years (in some cases 20 years or longer)
          • Responsibility for proper management and preservation of data and primary materials is shared between the researcher and the research organisation [although deposit within national collections is endorsed]
  • 10. Research funding bodies
      • UK Research Councils
          • Help fund some data archives, e.g. Archaeology Data Service, European Bioinformatics Institute, the NERC data centres, UK Data Archive
          • Support for JISC (and DCC)
      • RCUK Common Principles on Data Policy
          • Recognises that data are a critical output of the research process
          • http://www.rcuk.ac.uk/research/Pages/DataPolicy.aspx
  • 11. RCUK Principles (in a nutshell)
      • Publicly funded research data should be made openly available
      • Data with acknowledged long-term value should be preserved and remain accessible and usable for future research
      • Sufficient metadata should be recorded to enable other researchers to find and understand the research and so re-use it; published results should always include information on how to access the supporting data
      • Recognition that there may be legal, ethical and commercial constraints
      • Recognition that researchers may need privileged use of data for a limited period
      • All users of research data should acknowledge their sources
      • It is appropriate to use public funds to support the management of research data (MRD)
  • 12. EPSRC expectations
      • Roadmap approved May 2012; compliance by May 2015
      • Appropriate metadata (including unique IDs) to be made freely available on the Internet within 12 months of data generation
      • Data not generated in digital format should be stored in a manner to facilitate it being shared
      • Data should be securely preserved for a minimum of 10 years after privileged access expires or the last date access was requested by a third party
      • Adequate resources from existing funding streams
      • EPSRC will monitor progress and compliance, and reserves the right to impose appropriate sanctions
  • 13. Implications for researchers
      • Increasing number of research councils and funding bodies with data management and sharing requirements
      • Potential loss of research income if these mandates are not met
      • Need to determine the costs associated with short and longer-term management and curation and to request funds as part of grant
      • Responsibility for infrastructure shifting more to HEIs and less to centralised data archives, but institutional infrastructures and services are still emerging
      • Need guidance: some good external support
      • But also need more local support; often fragmented (need to draw upon existing channels within your institution wherever possible)
  • 14. Institutional drivers
      • Safeguarding research integrity
      • Increasing number of FOI requests for data
      • Adhering to existing codes of research practice and ethics
      • Developing new institution-wide strategies, policies and services for data storage and management
      • Increased institutional focus on research management (e.g., in response to REF)
      • Benchmarking: self-assessing infrastructure and planning for improvement
      • More demands but fewer resources to work with
  • 15. Institutional actors
      • Researchers
          • Both as creators and users of data
          • PIs (e.g., have specific roles with respect to grants)
          • Computer scientists (informaticians, data scientists)
      • Administration
          • Research support office (e.g., grants support, research information management)
          • Records managers, archivists, FOI office
      • Central services
          • Computing services
          • Libraries (e.g., institutional repository)
  • 16. Research data lifecycles
  • 18. (e)-Research Life Cycle view of Data Curation?
      [Diagram. Stages, each linked by data processing:]
      • Formulate hypothesis / ideas, test, experiment, observe: data creation, data collection & capture
      • Data management, storage & validation: description, deposit, self-archiving, preservation, certification
      • Adding value: data linking, annotation, visualisation, simulation
      • (New) knowledge extraction: mining, modelling, analysis, synthesis
      • Scholarly communications: data disclosure, publication, citation, discovery, re-use
      • Centre of diagram: e-Infrastructure, Open access, Collaboration
      Liz Lyon, December 2005. This work is licensed under a Creative Commons License Attribution-ShareAlike 2.0.
  • 19. E-Science Curation Report - 2003
      • E-science discipline
      • Appropriate for current focus
      • Takes integrated look at higher education data curation problems
      • Granularity on curation activities?
  • 20. Open Archival Information System
  • 21. RDM at Oxford
  • 22. Research360@Bath
      • New institutional data scientist role
      • Addresses EPSRC expectations (published)
      • Doctoral Training Centre hubs
      • Faculty-Industry focus
      • Faculty cascade model
      • Multi-team approach
      • http://blogs.bath.ac.uk/research360/
  • 24. Some library roles (in the lifecycle)
      • Leadership: coordinate action
      • Audit: who has what, where does it go?
      • Advice on access: data, wherever it is
      • Preservation (long-term access requirements)
      • Citability
      • Data/publication linking
      • Promoting data in teaching
      • Identifying skill gaps / CPD requirements
  • 25. Re-skilling for research (RLUK, 2012)
      • Mary Auckland identified 9 key areas with skill gaps for subject librarians:
          • Ability to advise on preserving research outputs
          • Knowledge to advise on data management and curation, including ingest, discovery, access, dissemination, preservation, and portability
          • Knowledge to support researchers in complying with the various mandates of funders, including open access requirements
          • Knowledge to advise on potential data manipulation tools used in the discipline/subject
          • Knowledge to advise on data mining
          • Knowledge to advocate, and advise on, the use of metadata
          • Ability to advise on the preservation of project records, e.g. correspondence
          • Knowledge of sources of research funding to assist researchers to identify potential funders
          • Skills to develop metadata schema, and advise on discipline/subject standards and practices, for individual research projects
  • 26. Understanding data requirements
      • http://www.dcc.ac.uk/
  • 27. Data management planning
  • 28. Data registries
      • Findable, citable data has value
      • Important to link publications to data (and vice versa)
      • Increases citations of both data & publication
      • Increases reuse (hence value)
      • But effects exist even without publication
      • All benefit: researcher, institution, publisher
  • 29. Tools to track impact
      • http://total-impact.org/
  • 30. Activities, roles, requirements (1)
      • Requirements gathering
          • Identifying researchers’ data requirements
          • Developing a shared understanding of what needs to be done (e.g., identifying where data exist, its form and scale, any existing retention requirements)
          • Identifying good practice within the institution (and the opposite)
          • Methods: surveys, focus groups, case studies, joint R&D projects, assessment tools (e.g. DAF)
  • 31. Activities, roles, requirements (2)
      • Identifying motivations and benefits
          • For researchers, support services, the institution
      • Identifying risks
          • Data loss (institution, research group, individual)
          • Increased costs (lack of planning, service inefficiency, data loss)
          • Legal compliance (research funder, H&S, ethics, FOI)
          • Reputation (institution, unit, individual)
      • Identifying costs
          • Keeping Research Data Safe (KRDS) toolkit
  • 32. Activities, roles, requirements (3)
      • Assessing institutional preparedness
          • Identifying institutional stakeholders, existing data support services, gaps
          • Benchmarking and planning for the future
          • Skills audit
          • CARDIO tool
      • Policy development
          • Approval by senior management is just the start; policies need to be embedded in research practice and responsive to changing requirements
      • Data management planning
          • DMPonline; DCC guide "How to Develop a Data Management Plan"
  • 33. Activities, roles, requirements (4)
      • Implementation and service development
          • Integrating where possible with existing services, e.g. IR, CRIS, VRE, HPC, cloud services, social media, etc.
          • Appraisal: deciding what needs to be kept and for how long
          • Storage choices: no one-size-fits-all solution, e.g. Bristol’s BluePeta petascale storage facility, Bath’s X-Drive approach, cloud approaches
          • Data documentation and metadata: layered approaches, with top-level discovery (core metadata, collection/experiment-level?) and a role for standards like DCMI, CERIF, DDI, etc.
  • 34. Activities, roles, requirements (5)
      • Data issues:
          • Appraisal: selection criteria, retention periods (who decides?)
              • DCC guide "How to appraise and select research data for curation"
          • Documentation: metadata, schema, semantics
          • Formats: proprietary formats, community standards, etc.
          • Provenance and authenticity
          • Citation (assignment of persistent IDs?)
          • Access (embargo policies?)
          • Licensing
              • DCC guide "How to license research data"
  • 35. Things to do …
      • Create policy: collaborate with others
          • Growing number of policies being published (EPSRC, Wellcome Trust)
      • Build on existing digital services
          • Examples: storage, data registry
      • Learn about audit tools (DCC & others)
      • Learn about data & sources
      • Re-skill subject librarians
      • Bridge between publishers & researchers
  • 36. DCC resources
      • What data to keep
      • http://www.dcc.ac.uk/resources
  • 37. Thank you. Any questions?
      Michael Day, Digital Curation Centre, UKOLN, University of Bath, m.day@ukoln.ac.uk