Supporting Libraries in
Leading the Way in
Research Data Management
Marieke Guy, Institutional Support Officer,
Digital Curation Centre, UKOLN, University of Bath, UK

Email: m.guy@ukoln.ac.uk
Twitter Id: mariekeguy
Web: http://www.dcc.ac.uk

Online Information, 20th-21st November 2012



                                                 UKOLN is supported by:



           This work is licensed under a Creative Commons Licence
           Attribution-ShareAlike 2.0


  1
Who Am I?

    • Have worked for UKOLN for over 12 years
    • Worked on a variety of projects:
      Subject portals project, IMPACT, Good APIs,
      JISC Observatory, cultural heritage work,
       digital preservation work, …etc
    • Remote worker, into amplified events
    • Co-chair of IWMW for a number of years

    • Now working for the Digital Curation Centre
    • Institutional Support Officer helping HEIs with their RDM




2
Today’s Talk

     • Research data and why it is so important

     • How research data is managed

     • What the DCC does

     • The role libraries are currently playing

     • The role libraries could be playing in the future




3
Research Data

Image sources:
http://www.google.co.uk/imgres?q=illumina+bgi&hl=en&client=firefox-a&hs=Jl2&rls=org.mozilla:en-GB:official&biw=1366&bih
http://www.flickr.com/photos/thinkmulejunk/352387473/
http://www.flickr.com/photos/usfsregion5/4546851916
http://www.flickr.com/photos/charleswelch/3597432481/
http://www.flickr.com/photos/wasp_barcode/4793484478/
4
What is Research Data?
    …whatever is produced in research or evidences its outputs




    • Facts
    • Statistics
    • Qualitative
    • Quantitative
    • Not published research output
    • Discipline specific

    “highest priority research data is that which underpins a research output”
5
A Data Present

         “Data underpins our economy and
         our society - data about how
         much is being spent and where,
         data about how schools, hospitals
         and police are performing, data
         about where things are and data
         about the weather.”

         Tim Berners-Lee, Director of W3C.



6
Big Data




                                             • Volume
                                             • Velocity
                                             • Variety



               “The 1000 Genomes Project generated more DNA
               sequence data in its first 6 months than GenBank
               had accumulated in its entire 21 year existence”
7
A Data Future

         “The ability to take data - to
         be able to understand it, to
         process it, to extract value
         from it, to visualise it, to
         communicate it - that’s going to
         be a hugely important skill in
         the next decades.”

         Hal Varian, Google’s chief economist.



8
Big Data…and Small Data

     •   DIY data
     •   Consumer data
     •   Crowd Sourced data
     •   What about Linked data/
         Web of data/Open data?

     •   Databases
     •   Learning data
     •   Administrative data
     •   Long tail data

     JISC MaRDI-Gross project: “volume is the least significant
     (issue) in the present context, since it is ‘only’ a technical
     problem”
9
Some Data Issues

      • Scale and complexity – data
        deluge – volume, pace
      • Infrastructure and management
        – Storage, costs & sustainability
      • Quality of data
      • Reputation – FOI, DPA, computer
        misuse
      • Openness agenda
      • Preservation
      • Working in partnerships
      • Funding for researchers



10
Funding…the Biggest Carrot/Stick?




     EPSRC expects all those institutions it funds:

• to develop a roadmap that aligns their policies and processes with
  EPSRC’s expectations by 1st May 2012;
• to be fully compliant with these expectations by 1st May 2015.

http://www.epsrc.ac.uk/about/standards/researchdata/Pages/expectations.aspx


11
Data Policies of Funders




     http://www.dcc.ac.uk/resources/policy-and-legal/overview-funders-data-policies

12
What is Research Data Management?



                     “the active management and
                        appraisal of data over the
                        lifecycle of scholarly and
                            scientific interest”


                       Data management is part of
                         good research practice



13
How is Research Data Managed?
     Some areas to think about:

     Example: Leicester University “Data management support for researchers” Web site


     •   Storage & cloud          •   Curation
     •   Data repositories        •   Digital Preservation
     •   Metadata & citation      •   Migration
     •   File naming              •   Sharing/openness
     •   Appraisal, selection &   •   Security
         deletion                 •   Cost
14
RDM Activities

      What kind of activities are involved?
         – producing and sharing of data with research colleagues
            in collaborative environments (internal and external)
         – file naming
         – applying metadata for context and discovery
         – ensuring that sensitive data is not shared or accessible
         – cleaning data for longer-term use
         – selecting mechanisms for data capture and storage
         – selecting and appraising data for short and longer-term
            retention
         – licensing data for reuse
         – developing data management plans

      • Data management is about making informed decisions
15
The Digital Curation Centre

      • A consortium comprising units from the Universities of Bath
        (UKOLN), Edinburgh (DCC Centre) and Glasgow (HATII)
      • launched 1st March 2004 as a national centre for solving
        challenges in digital curation that could not be tackled by
        any single institution or discipline
      • Funded by JISC with additional HEFCE funding from 2011
        for the provision of support to national cloud services
      • Targeted institutional development
      • http://www.dcc.ac.uk/




16
Advocacy and Training


                        How to…

                             • Appraise and Select
                               Research Data
                             • Cite Datasets and Link to
                               Publications
                             • Develop a Data
                               Management and Sharing
                               Plan
                             • License Research Data
                             • Set up an RDM service –
                               coming soon!



How to cite data
17
18
DCC Tools for Engagement

                                     Survey and interview
                                     methodology for
                                     investigating data holdings
                                     and how they are managed


      Capability model for establishing
         consensus on capabilities and
       gaps in current provision, rating
          organisation, technology and
                              resources

                                       Customised institutional
                                       templates for data
                                       management planning
19
Institutional Engagement Work

      • Funded by the HEFCE through its Universities
        Modernisation Fund (UMF)
      • Intensive, tailored support to increase research data
        management capability
      • Originally 18 Higher Education Institutions (HEIs) between
        Summer 2011 and Spring 2013
      • Can help:
         – win the support of senior management
         – understand current data practices
         – redesign data support services
         – develop policy and training




20
What Part are Libraries Playing?

      • RDM requires the input of all support services, but libraries
        are taking the lead in the UK
      • The library is leading on most of the DCC engagements

      Diagram: Library, Research Office and IT, with the note
      “Information management is a key skill in RDM, so it’s a major
      role for librarians”

      Other examples include:
         – EDINA at University of Edinburgh
         – Bodleian Library at the University of Oxford
         – Subject librarians at University of Southampton
21
Why are Libraries Taking the Lead?

        Because libraries:
        • Often run publication repositories so are the stakeholder
          called on when questions are raised about the
          management of associated data
        • Have directed the open sharing of publications so are
          well placed to advise on how best to support data
          requirements
        • Have good relationships with researchers and good
          connections with other service departments
        • Have a highly relevant skill set



22
An Exciting Opportunity

              “Researchers need help to
              manage their data. This is a
              really exciting opportunity for
              libraries….”
          Liz Lyon, VALA 2012

      •   Leadership
      •   Providing tools and support
      •   Advocacy and training
      •   Developing data informatics capacity & capability



23
Reskilling for Research
      But librarians feel they lack appropriate skills…

 Skills gap                                               2-5 years Now
 Preserving research outputs                              49%      10%
 Data management & curation                               48%      16%
 Complying with funder mandates                           40%      16%
 Data manipulation tools                                  34%      7%
 Data mining                                              33%      3%
 Metadata                                                 29%      10%
 Preservation of project records                          24%      3%
 Sources of research funding                              21%      8%
 Metadata schema, disciplinary standards, practices       16%      2%

      From RLUK, Re-skilling for Research, Jan 2012, p43
      Other surveys include DataOne, Cologne Uni, DigCurV
24
Specialist Knowledge

           “Very few librarians are likely to
           have specialist scientific or
           medical knowledge - if you train
           as a research scientist or a medic,
           you probably won’t become a
           librarian.”

           Mary Auckland: Reskilling for Research 2012,
           RLUK.



25
Knowledge Needed…

      • Librarians are overtaxed already, lack personal research
        experience, have little understanding of complexity and
        scale of issue
      • Need knowledge and understanding of:
         – Researchers’ practice and data holdings
         – Research Councils and funding bodies’ requirements
         – Disciplinary and/or institutional codes of practice and
            policies
         – Existing institutional policies and infrastructure
         – Reputational risks associated with poor data
            management – with respect to researchers’ reputations
            as well as that of their institutions
         – Data management and sharing benefits
         – Research data management tools and technologies
26
And Needed Fast…

                                                                    Implications of “Big Data”
                                                                    and data science for
                                                                    organisations in all sectors

                                                                    McKinsey Global Institute
                                                                    predicts a shortage of
                                                                    190,000 data scientists by
                                                                    2019




http://www.mckinsey.com/Insights/MGI/Research/Technology_and_Innovation/Big_data_The_next_frontier_for_innovation
27
Is Retooling Possible?

          “Significant mismatches exist between research
          data and library digital warehouses, as well as
          the processes and procedures librarians typically
          use to fill those warehouses. Repurposing
          warehouses and staff for research data is
          therefore neither straightforward nor simple; in
          some cases, it may even prove impossible.”

          Salo, D. (2010) Retooling Libraries for the Data Challenge,
          Ariadne, Issue 64.

      •   Libraries are organised, research data isn’t
      •   Need technical systems such as sheer curation, better
          sharing of data and improved funding models
28
Possible Approaches

      • University of Helsinki Library – Knotworking “collaborative
        performance between otherwise loosely connected actors
        and activity systems”
      • University Burnaby, British Columbia - providing research
        data services since the 1970s – currently exploring funding
        gaps
      • Deutsche Nationalbibliothek - DP4lib project (Digital
        Preservation for libraries) where the library is acting as a
        service-broker for digital data curation
      • Research libraries - Opportunities for Data Exchange (ODE)
        project as an exemplar project, which shares
        emerging best practice
      • Data intelligence 4 librarians, Delft University of
        Technology
29
Training Librarians: RDMRose

      • JISC funded project to produce OER learning materials in
        RDM tailored for Information professionals
      • Led by Sheffield University iSchool
      • Practitioner community based on the White Rose University
        Consortium’s libraries at the Universities of Leeds,
        Sheffield and York
      • Deliverables include curriculum, module within taught
        masters course in Sheffield, self study version
      • Much of course concentrates on teaching librarians about
        research and the research process
      • RDMRose working with Stephen Pinfield on a web-based
        survey of current library RDM activity
      • http://www.sheffield.ac.uk/is/research/projects

30
Informatics Transform

     • Library & institutional stakeholders
     • Roles (7 listed): Responsibilities,
       Requirements, Relationships

     1. Director IS/CIO/University Librarian
     2. Data librarians /data scientist /
        liaison/subject/faculty librarians
     3. Repository managers
     4. IT/Computing Services
     5. Research Support/Innovation Office
     6. Doctoral Training Centres
     7. PVC Research

            Liz Lyon, Informatics Transform, Ariadne Issue 68, 2012
31
Partnership Approaches

        • Research 360,
          University of Bath:

            • UKOLN-DCC
            • Library
            • IT services
            • Research
              Support Office
            • Doctoral
              Training
              Centres
     http://blogs.bath.ac.uk/research360/


32
Embedded Librarians

         “Librarians may need to raise their
         profile, become ‘researchers’
         themselves; getting embedded in the
         research community;
         gaining credibility; and collaborating
         as equals.”

         Bent et al, 'Information literacy in a researcher's
         learning life' in New Review of Information
         Networking, 13 (2), 2007



33
So What Next?

      • Address the lack of data informatics skills
      • Mainstream data librarians & data scientists
      • Embed new skills into LIS & iSchool curriculum


      Lyon, ‘The Informatics Transform: re-engineering libraries for
      the data decade’ in IJDC, 7(1), 2012


        Research data management highlights the applicability of the
        librarian’s skillset. By embracing the need to provide RDM
        support, librarians will remain at the heart of institutional
        agendas.

34
Resources to Look at…

      • Riding the Wave report and many others emphasise the
        relevance of research data to current academic working


      •   RLUK/Mary Auckland: Reskilling
          for Research
      •   Sheila Corrall: Libraries,
          Librarians and Data
      •   DigCurV
      •   Book: Managing Research Data
      •   HEIs research data support
          pages



35
Thank You

      • Thanks to DCC colleagues for contributing to slide material.



                             Any questions?

                           m.guy@ukoln.ac.uk




36

Supporting Libraries in Leading the Way in Research Data Management

  • 1. Supporting Libraries in Leading the Way in Research Data Management Marieke Guy, Institutional Support Officer, Digital Curation Centre, UKOLN, University of Bath, UK Email: m.guy@ukoln.ac.uk Twitter Id: mariekeguy Web: http://www.dcc.ac.uk Online Information, 20th -21st November 2012 UKOLN is supported by: This work is licensed under a Creative Commons Licence Attribution-ShareAlike 2.0 1
  • 2. Who Am I? • Have worked for UKOLN for over 12 years • Worked on variety of projects: Subject portals project, IMPACT, Good APIs, JISC Observatory, cultural heritage work, digital preservation work, …etc • Remote worker, into amplified events • Co-chair of IWMW for a number of years • Now working for Digital Curation Curation • Institutional Support Officer helping HEIs with their RDM 2
  • 3. Today’s Talk • Research data and why is it so important? • How research data is managed • What the DCC does • The role libraries are currently playing • The role libraries could be playing in the future 3
  • 4. http://www.google.co.uk/imgres?q=illumina+bgi&hl=en&client=firefox- a&hs=Jl2&rls=org.mozilla:en-GB:official&biw=1366&bih Research Data http://www.flickr.com/photos/think mulejunk/352387473/ http://www.flickr.com/photos/usf sregion5/4546851916 http://www.flickr.com/photos/wasp http://www.flickr.com/photos/charleswelch/3 _barcode/4793484478/ 4 597432481/
  • 5. What is Research Data? …whatever is produced in research or evidences its outputs • Facts • Statistics • Qualitative • Quantitative • Not published research output • Discipline specific “highest priority research data is that which underpins a research output” 5
  • 6. A Data Present “Data underpins our economy and our society - data about how much is being spent and where, data about how schools, hospitals and police are performing, data about where things are and data about the weather.” Tim Berners-Lee, director of W3C. 6
  • 7. Big Data • Volume • Velocity • Variety “The 1000 Genomes Project generated more DNA sequence data in its first 6 months than GenBank had accumulated in its entire 21 year existence” 7
  • 8. A Data Future “The ability to take data - to be able to understand it, to process it, to extract value from it, to visualise it, to communicate it - that’s going to be a hugely important skill in the next decades.” Hal Varian, Chief Economist, Google 8
  • 9. Big Data…and Small Data • DIY data • Consumer data • Crowd Sourced data • What about Linked data/ Web of data/Open data? • Databases • Learning data • Administrative data • Long tail data JISC MaRDI-Gross project: “volume is the least significant (issue) in the present context, since it is ‘only’ a technical problem” 9
  • 10. Some Data Issues • Scale and complexity – data deluge – volume, pace • Infrastructure and management – Storage, costs & sustainability • Quality of data • Reputation – FOI, DPA, computer misuse • Openness agenda • Preservation • Working in partnerships • Funding for researchers 10
  • 11. Funding…the Biggest Carrot/Stick? EPSRC expects all those institutions it funds: • to develop a roadmap that aligns their policies and processes with EPSRC’s expectations by 1st May 2012; • to be fully compliant with these expectations by 1st May 2015. • http://www.epsrc.ac.uk/about/standards/researchdata/Pages/expectations.aspx 11
  • 12. Data Policies of Funders http://www.dcc.ac.uk/resources/policy-and-legal/overview-funders-data-policies 12
  • 13. What is Research Data Management? “the active management and appraisal of data over the lifecycle of scholarly and scientific interest” Data management is part of good research practice 13
  • 14. How is Research Data Managed? Some areas to think about: • Storage & cloud • Data repositories • Metadata & citation • File naming • Appraisal, selection & deletion • Curation • Digital Preservation • Migration • Sharing/openness • Security • Cost (Image: Leicester University “Data management support for researchers” web site) 14
  • 15. RDM Activities What kind of activities are involved? – producing and sharing of data with research colleagues in collaborative environments (internal and external) – file naming – applying metadata for context and discovery – ensuring that sensitive data is not shared or accessible – cleaning data for longer-term use – selecting mechanisms for data capture and storage – selecting and appraising data for short and longer-term retention – licensing data for reuse – developing data management plans •Data management is about making informed decisions 15
  • 16. The Digital Curation Centre • A consortium comprising units from the Universities of Bath (UKOLN), Edinburgh (DCC Centre) and Glasgow (HATII) • launched 1st March 2004 as a national centre for solving challenges in digital curation that could not be tackled by any single institution or discipline • Funded by JISC with additional HEFCE funding from 2011 for the provision of support to national cloud services • Targeted institutional development • http://www.dcc.ac.uk/ 16
  • 17. Advocacy and Training How to… • Appraise and Select Research Data • Cite Datasets and Link to Publications • Develop a Data Management and Sharing Plan • License Research Data • Set up an RDM service – coming soon! (Image: How to cite data) 17
  • 18. 18
  • 19. DCC Tools for Engagement • Survey and interview methodology for investigating data holdings and how they are managed • Capability model for establishing consensus on capabilities and gaps in current provision, rating organisation, technology and resources • Customised institutional templates for data management planning 19
  • 20. Institutional Engagement Work • Funded by the HEFCE through its Universities Modernisation Fund (UMF) • Intensive, tailored support to increase research data management capability • Originally 18 Higher Education Institutions (HEIs) between Summer 2011 and Spring 2013 • Can help: – win the support of senior management – understand current data practices – redesign data support services – Help with policy development and training 20
  • 21. What Part are Libraries Playing? • RDM requires the input of all support services, but libraries are taking the lead in the UK • The library is leading on most of the DCC engagements Other examples include: – EDINA at University of Edinburgh – Bodleian Library at University of Oxford – Subject librarians at University of Southampton (Diagram: Library, Research Office, IT) Information management is a key skill in RDM, so it’s a major role for the librarians 21
  • 22. Why are Libraries Taking the Lead? Because libraries: • Often run publication repositories so are the stakeholder called on when questions are raised about the management of associated data • Have directed the open sharing of publications so are well placed to advise on how best to support data requirements • Have good relationships with researchers and good connections with other service departments • Have a highly relevant skill set 22
  • 23. An Exciting Opportunity “Researchers need help to manage their data. This is a really exciting opportunity for libraries….” Liz Lyon, VALA 2012 • Leadership • Providing tools and support • Advocacy and training • Developing data informatics capacity & capability 23
  • 24. Reskilling for Research But librarians feel they lack appropriate skills…
      Skills gap | 2-5 years | Now
      Preserving research outputs | 49% | 10%
      Data management & curation | 48% | 16%
      Complying with funder mandates | 40% | 16%
      Data manipulation tools | 34% | 7%
      Data mining | 33% | 3%
      Metadata | 29% | 10%
      Preservation of project records | 24% | 3%
      Sources of research funding | 21% | 8%
      Metadata schema, disciplinary standards, practices | 16% | 2%
    From RLUK, Re-skilling for Research, Jan 2012, p43. Other surveys include DataOne, Cologne Uni, DigCurV 24
  • 25. Specialist Knowledge “Very few librarians are likely to have specialist scientific or medical knowledge - if you train as a research scientist or a medic, you probably won’t become a librarian.” Mary Auckland: Reskilling for Research 2012, RLUK. 25
  • 26. Knowledge Needed… • Librarians are overtaxed already, lack personal research experience, have little understanding of complexity and scale of issue • Need knowledge and understanding of: – Researchers’ practice and data holdings – Research Councils and funding bodies’ requirements – Disciplinary and/or institutional codes of practice and policies – Existing institutional policies and infrastructure – Reputational risks associated with poor data management – with respect to researchers’ reputations as well as that of their institutions – Data management and sharing benefits – Research data management tools and technologies 26
  • 27. And Needed Fast… Implications of “Big Data” and data science for organisations in all sectors McKinsey Global Institute predicts a shortage of 190,000 data scientists by 2019 http://www.mckinsey.com/Insights/MGI/Research/Technology_and_Innovation/Big_data_The_next_frontier_for_innovation 27
  • 28. Is Retooling Possible? “Significant mismatches exist between research data and library digital warehouses, as well as the processes and procedures librarians typically use to fill those warehouses. Repurposing warehouses and staff for research data is therefore neither straightforward nor simple; in some cases, it may even prove impossible.” Salo, D. (2010) Retooling Libraries for the Data Challenge, Ariadne, Issue 64. • Libraries are organised, research data isn’t • Need technical systems such as sheer curation, better sharing of data and improved funding models 28
  • 29. Possible Approaches • University of Helsinki Library – Knotworking: “collaborative performance between otherwise loosely connected actors and activity systems” • University Burnaby, British Columbia – providing research data services since the 1970s, currently exploring funding gaps • Deutsche Nationalbibliothek – DP4lib project (Digital Preservation for libraries), where the library is acting as a service-broker for digital data curation • Research libraries – Opportunities for Data Exchange (ODE) project as an exemplar project, which shares emerging best practice • Data intelligence 4 librarians, Delft University of Technology 29
  • 30. Training Librarians: RDMRose • JISC funded project to produce OER learning materials in RDM tailored for Information professionals • Led by Sheffield University iSchool • Practitioner community based on the White Rose University Consortium’s libraries at the Universities of Leeds, Sheffield and York • Deliverables include curriculum, module within taught masters course in Sheffield, self study version • Much of course concentrates on teaching librarians about research and the research process • RDMRose working with Stephen Pinfield on a web-based survey of current library RDM activity • http://www.sheffield.ac.uk/is/research/projects 30
  • 31. Informatics Transform • Library & institutional stakeholders • Roles (7 listed): Responsibilities, Requirements, Relationships 1. Director IS/CIO/University Librarian 2. Data librarians /data scientist / liaison/subject/faculty librarians 3. Repository managers 4. IT/Computing Services 5. Research Support/Innovation Office 6. Doctoral Training Centres 7. PVC Research Liz Lyon, Informatics Transform, Ariadne Issue 68, 2012 31
  • 32. Partnership Approaches • Research 360, University of Bath: • UKOLN-DCC • Library • IT services • Research Support Office • Doctoral Training Centres http://blogs.bath.ac.uk/research360/ 32
  • 33. Embedded Librarians “Librarians may need to raise their profile, become ‘researchers’ themselves; getting embedded in the research community; gaining credibility; and collaborating as equals.” Bent et al, 'Information literacy in a researcher's learning life' in New Review of Information Networking, 13 (2), 2007 33
  • 34. So What Next? • Address the lack of data informatics skills • Mainstream data librarians & data scientists • Embed new skills into LIS & iSchool curriculum Lyon, ‘The Informatics Transform: re-engineering libraries for the data decade’ in IJDC, 7(1), 2012 Research data management highlights the applicability of the librarian’s skillset. By embracing the need to provide RDM support, librarians will remain at the heart of institutional agendas. 34
  • 35. Resources to Look at… • Riding the Wave report and many others emphasise the relevance of research data to current academic working • RLUK/Mary Auckland: Reskilling for Research • Sheila Corrall: Libraries, Librarians and Data • DigCurV • Book: Managing Research Data • HEIs’ research data support pages 35
  • 36. Thank You • Thanks to DCC colleagues for contributing to slide material. Any questions? m.guy@ukoln.ac.uk 36

Editor's Notes

  • #3: Bath – research led uni – one of top 10 in UK
  • #4: Will think about what research data is and its importance. Later look at where libraries fit into the picture
  • #5: Gene sequencing machines at the Beijing Genomics Institute – one of the largest in the world – crunching out data 24/7. Air quality monitoring on Sierra National Forest. Lovell Radio Telescope – scanning the night sky. Some facts about the Lovell Radio Telescope.
Mass of telescope 3200 tonnes.
Mass of bowl 1500 tonnes.
Diameter of bowl 76.2 metres.
Maximum height above ground 89 metres. Very impressive.
The Lovell Radio Telescope at Jodrell Bank dominates the Cheshire countryside. WDI4500 2D Barcode scanner w/ barcodes Wasp WDI4500 2D barcode scanner launched July 2010 can scan 2D barcodes, 1D (linear) barcodes and postal barcodes. www.waspbarcode.com/scanners/wdi4500_barcode_scanner.asp National Gene Bank in Shenzhen, China Sensor equipment to monitor air quality in desert Streaming data to our desktops
  • #8: Large Hadron Collider: the world's largest and highest-energy particle accelerator, based at CERN. Big data is data that exceeds the processing capacity of conventional database systems. The data is too big, moves too fast, or doesn’t fit the strictures of your database architectures. To gain value from this data, you must choose an alternative way to process it. Systems such as Hadoop are in place for this.
  • #10: Apache Hadoop Big Data Platform. The MaRDI-Gross project is funded by JISC to support big-science projects in developing suitable Data Management and Preservation (DMP) plans. University of Glasgow & University of Lancaster
  • #11: Queen's University in Belfast has been told by the Information Commissioner to hand over 40 years of research data on tree rings, used for climate research. Douglas Keenan, from London, had asked for the information in 2007 under the Freedom of Information Act. Mr Keenan is well-known for his questioning of scientists who propose a human cause for climate change. Queen's University refused his request saying it was too expensive, but it is now considering its position. The university claimed that as the information was unfinished, had intellectual property rights and was commercially confidential information, it did not have to pass it on. Philip Morris tobacco company wanted info on children and smoking. Refused in end – retracted claim.
  • #22: An interesting trend to emerge is who is addressing RDM within the unis. The library is leading in most cases and is involved regardless of who’s championing the cause. Research offices are often the lead partner – seemingly for strategic reasons of senior buy-in and financial commitment. IT are only leading in 2 out of the 20 cases and are disengaged / absent in a few others.
  • #25: There are nine areas where over 50% of the respondents with Subject Librarian responsibilities indicated that they have limited or no skills or knowledge, and in all cases these were also deemed to be of increasing importance in the future. These are listed in order of the importance in 2 - 5 years that respondents placed on them.
  • #30: Knotworking is characterised by collaborative performance between otherwise loosely connected actors and activity systems. The idea is to gather experts for a short period of time to solve a specific problem in the academic library. It is clear that librarians cannot be committed to a single research group because there are not enough librarians and the work is very demanding. So at Helsinki University they have created a new model of customised and standardised services, which are currently at implementation stage. For example, the librarians have created a quick reference guide for one research group on how to handle research data management. Research data management is something that librarians have only recently become involved with and they still have much to learn. The system of knotworking was bringing research groups back to the library and generating new demanding services.