The Data Management Ecosystem
4 April 2013

University of California Curation Center
California Digital Library
The research data problem

• Journal article                          • Research data
  – Uniquely and persistently identified     – Nope
  – Concept of “publish”                     – Not really
  – Multiple copies                          – Typically one
  – Easily findable                          – Difficult
  – Services: impact metrics, citation       – Nope
    tracking, etc.

Research data is seen as a second-class citizen in the scholarly record.
An ecosystem of inter-dependent partners
 Besides data repository and publisher partners...
 • researchers
 • educators
 • citizen science groups
 • funders
 • tenure and promotion committees


  Libraries as neutral connection partners
Where can libraries make a difference?
     Research & Scholarship Lifecycle (cycle diagram):
     Research > Collect > Publish > Share > Save > Research,
     with “Create Knowledge” at the center
Collect > Publish > Share > Save > Research

 • Create, edit, share, and save data management plans
 • Open source curation add-in for Microsoft Excel
 • Capture today’s web; build tomorrow’s archives
Collect > Publish > Share > Save > Research

 • Create and manage persistent identifiers: ARKs, DOIs, etc.
 • An infrastructure to publish and get credit for sharing research data
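
A minimal sketch of how a script might tell the two identifier types apart and turn each into a clickable link. It assumes the public resolvers doi.org (for DOIs) and n2t.net (for ARKs); the function name `resolver_url` and the sample identifiers are hypothetical placeholders, not taken from the slides.

```python
def resolver_url(identifier: str) -> str:
    """Return a resolver URL for a DOI or ARK string.

    Assumption: DOIs are resolved through doi.org and ARKs through
    n2t.net. The function name is illustrative only.
    """
    ident = identifier.strip()
    lowered = ident.lower()
    if lowered.startswith("doi:"):
        # Drop the "doi:" label; the resolver expects the bare DOI.
        return "https://doi.org/" + ident[4:]
    if lowered.startswith("10."):
        # A bare DOI always begins with the "10." directory indicator.
        return "https://doi.org/" + ident
    if lowered.startswith("ark:"):
        # ARKs keep their "ark:" label when appended to the resolver.
        return "https://n2t.net/" + ident
    raise ValueError(f"not a recognized DOI or ARK: {identifier!r}")


if __name__ == "__main__":
    # Placeholder identifiers for illustration only.
    for sample in ("doi:10.5072/FK2TEST", "ark:/99999/fk4test"):
        print(sample, "->", resolver_url(sample))
```

Keeping the resolver separate from the identifier string is what lets the same DOI or ARK stay valid even if the data moves to a new host.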
Collect > Publish > Share > Save > Research

 • Curation repository: store, manage, preserve, and share research data
 • Open deposit, open access repository for spreadsheet data
 • Data Observation Network for Earth
Collect > Publish > Share > Save > Research

What’s missing to complete the “incentive” circuit?
• Impact measures, citation tracking

    “Connecting the data to the research it informs”

Altmetrics tools to measure non-traditional products and uses: [logo], [logo], etc.
Stable storage: Merritt repository
          • Curation repository open to the UC
            community and beyond
          • Discipline / content agnostic
          • Micro-services architecture
          • Easy-to-use UI or API
          • Hosted or locally deployed
EZID: Long term identifiers made easy
• Precise identification of a
  dataset (DOI or ARK)
• Credit to data producers and
  data publishers
• A link from the traditional
  literature to the data (DataCite)
• Exposure and research metrics
  for datasets
  (Web of Knowledge, Google)

Take control of the management and distribution of your research, share and
get credit for it, and build your reputation through its collection and
documentation.
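
As a rough illustration of what requesting an identifier can look like from a script, the sketch below builds the kind of plain-text metadata body a minting service accepts and reads back a one-line status. It assumes the "name: value" (ANVL-style) format with percent-encoding described in EZID's public API documentation; the function names, the metadata values, and the sample response string are hypothetical, and no network request is made.

```python
import re
from typing import Dict, Tuple


def escape(text: str) -> str:
    """Percent-encode characters that would break a 'name: value' line."""
    return re.sub(r"[%:\r\n]", lambda m: "%%%02X" % ord(m.group(0)), text)


def build_anvl(metadata: Dict[str, str]) -> str:
    """Serialize metadata as one 'name: value' pair per line."""
    return "\n".join(f"{escape(k)}: {escape(v)}" for k, v in metadata.items())


def parse_status(response_body: str) -> Tuple[bool, str]:
    """Split a 'success: ...' or 'error: ...' first line into (ok, detail)."""
    first_line = response_body.splitlines()[0] if response_body else ""
    status, _, detail = first_line.partition(":")
    return status.strip() == "success", detail.strip()


if __name__ == "__main__":
    # Hypothetical dataset description; element names are assumptions
    # modeled on the DataCite-style fields an identifier service may accept.
    body = build_anvl({
        "_target": "http://example.org/dataset/42",
        "datacite.title": "Soil moisture readings",
        "datacite.creator": "Example, Researcher",
        "datacite.publicationyear": "2013",
    })
    print(body)
    # Hypothetical response line, for illustration only.
    print(parse_status("success: ark:/99999/fk4example"))
```

The metadata sent with the request is what later supports the credit, linking, and exposure benefits listed on this slide, so it is worth assembling carefully before minting.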
Discovery: DataCite consortium
• Technische Informationsbibliothek (TIB), Germany
• Australian National Data Service (ANDS)
• The British Library
• California Digital Library, USA
• Canada Institute for Scientific and Technical Information (CISTI)
• L’Institut de l’Information Scientifique et Technique (INIST), France
• Library of the ETH Zürich
• Library of TU Delft, The Netherlands
• Office of Scientific and Technical Information, US Department of Energy
• Purdue University, USA
• Technical Information Center of Denmark
New distributed framework
Flexible, scalable, sustainable network

Coordinating Nodes
• retain complete metadata catalog
• subset of all data
• perform basic indexing
• provide network-wide services
• ensure data availability (preservation)
• provide replication services

Member Nodes
• diverse institutions
• serve local community
• provide resources for managing their data
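
To make the division of labor between the two node roles concrete, here is a toy in-memory model. It is only a sketch: the class names, method names, and the replication policy (keep at least two copies of every object) are assumptions made for illustration and are not the network's actual API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class MemberNode:
    """Holds one institution's data and serves its local community."""
    name: str
    objects: Dict[str, bytes] = field(default_factory=dict)

    def deposit(self, identifier: str, content: bytes) -> None:
        self.objects[identifier] = content


@dataclass
class CoordinatingNode:
    """Keeps the complete metadata catalog and arranges replication."""
    members: List[MemberNode] = field(default_factory=list)
    # Catalog maps each identifier to the names of nodes holding a copy.
    catalog: Dict[str, Set[str]] = field(default_factory=dict)

    def register(self, node: MemberNode) -> None:
        self.members.append(node)

    def harvest(self) -> None:
        """Rebuild the catalog by asking every member what it holds."""
        self.catalog = {}
        for node in self.members:
            for identifier in node.objects:
                self.catalog.setdefault(identifier, set()).add(node.name)

    def replicate(self, min_copies: int = 2) -> None:
        """Copy under-replicated objects to members that lack them."""
        self.harvest()
        by_name = {node.name: node for node in self.members}
        for identifier, holders in self.catalog.items():
            source = by_name[sorted(holders)[0]]
            for node in self.members:
                if len(holders) >= min_copies:
                    break
                if node.name not in holders:
                    node.deposit(identifier, source.objects[identifier])
                    holders.add(node.name)


if __name__ == "__main__":
    network = CoordinatingNode()
    nodes = [MemberNode("A"), MemberNode("B"), MemberNode("C")]
    for node in nodes:
        network.register(node)
    nodes[0].deposit("ark:/99999/fk4example", b"observations")  # placeholder id
    network.replicate(min_copies=2)
    print(network.catalog)
```

The point of the split is that member nodes stay small and local while the catalog and the replication decisions live in one coordinating role, which is what lets the network grow without every institution tracking every other.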
The rest of the story


        www.cdlib.org/uc3


      John.Kunze@ucop.edu
uc3@ucop.edu for service questions

Editor's Notes

  • #2: Panel: Partnerships between institutional repositories, domain repositories, and publishers. 20-25 mins, 9:30-11am. The 'data management ecosystem' angle seems appropriate for the panel, but feel free to share some of the technical aspects with the audience, too. Partnerships via conventions and APIs. Data citation conventions. Libraries are chipping away on several fronts to try to shrink this "data curation" problem to a more manageable size, and they are offering a great deal of support for data management planning, data citation, identifier and repository services, repository federation, and “data publication”.
  • #4: Research data can be seen to fit in a kind of ecosystem of inter-dependent stakeholder niches. Each niche depends on other niches. In a broad sense, partnerships are about dependencies. Besides explicit partnerships between publishers and institutional and domain repositories, there are other critical inter-dependencies, essentially implicit partnerships. Libraries as neutral connectors to sub-partners in system development and collection building; linking with museums and archives.
  • #6: Development partners: DMPTool: U Va, Smithsonian, DCC, et al. DataUp: MSRC, GBMF, D1. WAS: LC, UNT, NYU, et al. User partners (clients, patrons, customers): any.
  • #7: Partners: JISC/EDINA, paying customers on two continents
  • #8: D1 network partners all over the world
  • #10: partnering with escholarship and UC campuses for collection building
  • #11: Partnering with JISC/EDINA, DataCite, the Research Data Alliance
  • #12: Each member partners with regional data repositories. DataCite partners with publishers (eg, T-R) for data citation index. Credit. Discovery. Impact tracking. Helping data authors verify use of their data and helping identify how others have used the data. With archiving: re-use and reproducibility.