Developing Infrastructure to Support Closer
Collaboration of Aggregators with Open
Repositories
Dr. Nancy Pontika & Dr. Petr Knoth
COnnecting Repositories (CORE)
Open University, UK
LIBER 2015, 24 – 26 June, London
Mission of CORE
Aggregate all open access content distributed
across different systems worldwide, enrich this
content and provide access to it through a set of
services …
[Source: http://core.ac.uk/about#mission]
Need for a UK aggregator
Bringing the UK’s open access
research outputs together:
• Feasibility study commissioned
by Jisc, published June 2014
• Referred to as “Open Mirror”
[Source: https://repository.jisc.ac.uk/5570/1/JISC_REPORT_open_mirror_090514_FINAL_WEB.pdf]
Three levels of support
• Programmable Data Access
– Services: CORE API, CORE Data Dumps
– Users: researchers, developers, companies
• Transaction Information Access
– Services: CORE Portal, CORE Mobile, CORE Plugin
– Users: researchers, students, lifelong learners
• Analytical Information Access
– Services: CORE Policy, CORE Compliance Analytics, CORE Dashboard
– Users: funders, governments, data providers
[Source: http://www.dlib.org/dlib/november12/knoth/11knoth.html]
CORE Statistics
• Content: 20M+ records, 600+ repositories, 1.8M+
full-texts
• The UK national aggregator - Jisc
• Full-text aggregator (not just metadata)
• Placed among Top 10 search engines for research
that go beyond Google [Jisc, 2013]
• Listed among Top 100 Thesis and Dissertation
Resources
• Part of Jisc’s Repositories Shared Services Project
(RSSP)
Aggregation process
• Metadata download, extraction and cleaning
• Full-text harvesting
• Text extraction
• Language detection
• Extraction of citation references from text
• Identification of related content
• Detection of duplicate items
• Parsing of author names
• Indexing
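One of the steps above, detection of duplicate items, can be illustrated with a minimal sketch. It assumes that two records are duplicates when their titles match after normalisation; the slides do not say how CORE detects duplicates, so the function names, the record layout and the matching rule are all hypothetical.

```python
import re
import unicodedata
from collections import defaultdict


def normalise_title(title: str) -> str:
    """Lower-case, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", title)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    cleaned = re.sub(r"[^a-z0-9]+", " ", ascii_only.lower())
    return cleaned.strip()


def group_duplicates(records):
    """Return lists of record ids whose normalised titles are identical."""
    groups = defaultdict(list)
    for record in records:
        groups[normalise_title(record["title"])].append(record["id"])
    return [ids for ids in groups.values() if len(ids) > 1]


# Hypothetical records harvested from two different repositories.
records = [
    {"id": "a1", "title": "Open Access in the UK"},
    {"id": "b2", "title": "Open access, in the UK."},
    {"id": "c3", "title": "Text Mining Repositories"},
]
print(group_duplicates(records))
```

Exact matching on a normalised title is only a first pass; a production aggregator would probably also compare authors, identifiers or full text.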
CORE Applications
• CORE Portal
– Search engine providing open access content
• CORE Mobile
– Android and iOS apps
• CORE Plugin
– For repositories and journals
• CORE API
– Programmable access to millions of resources
• CORE Dashboard
– Tool for repository managers
CORE Dashboard : purpose
• Harvested Records
• Metadata
• Harvesting Process
• Standards
• Repository Managers
• Funders
• Repositories
• Journals
[Diagram labels: Data Providers, Collaboration, Quality, Transparency]
Institution main page
Edit repository information
Invitations
Content
Manage record visibility status
• Take down
• Take up
Update metadata records
• Asynchronous process
• Item is queued in the CORE system
• Record is updated within 12 hours
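The asynchronous update described above can be sketched as a simple first-in, first-out queue. Only the 12-hour window comes from the slide; the class names, the in-memory queue and the example record identifier are assumptions made for illustration, not a description of how the CORE system is built.

```python
from collections import deque
from dataclasses import dataclass
from datetime import datetime, timedelta

UPDATE_WINDOW = timedelta(hours=12)  # upper bound stated on the slide


@dataclass
class UpdateRequest:
    record_id: str
    requested_at: datetime

    @property
    def due_by(self) -> datetime:
        """Latest time by which the record should have been refreshed."""
        return self.requested_at + UPDATE_WINDOW


class UpdateQueue:
    """Hypothetical in-memory queue of pending metadata updates."""

    def __init__(self) -> None:
        self._pending = deque()

    def enqueue(self, record_id: str, now: datetime) -> UpdateRequest:
        # The caller gets an acknowledgement immediately; work happens later.
        request = UpdateRequest(record_id, now)
        self._pending.append(request)
        return request

    def process_next(self):
        # A background worker would call this; returns None when idle.
        return self._pending.popleft() if self._pending else None

    def __len__(self) -> int:
        return len(self._pending)


queue = UpdateQueue()
request = queue.enqueue("oai:example.org:1", datetime(2015, 6, 24, 9, 0))
print(request.due_by)
```

The point of the design is that the repository manager's request returns at once, while the re-harvest itself runs in the background within the stated window.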
Statistics
Issues : 3 types
• When harvesting your repository/document we encountered an error that we couldn't
resolve. These errors need to be fixed in order to harvest your repository/document.
• We encountered an error but we were still able to harvest the repository/document. We
strongly recommend that these issues are resolved as they may lead to incompatibility
problems in the future.
• This may not be a problem, but it may be a clue to misconfiguration or future
incompatibilities.
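The three issue types can be read as severity levels. The sketch below is one way to model them; the names ERROR, WARNING and NOTICE and the classify function are hypothetical, since the slide describes the three types without naming them.

```python
from enum import Enum


class Severity(Enum):
    ERROR = "error"      # harvesting failed; must be fixed before harvesting
    WARNING = "warning"  # harvested anyway; fixing is strongly recommended
    NOTICE = "notice"    # possibly harmless; may hint at misconfiguration


def classify(harvest_succeeded: bool, error_encountered: bool) -> Severity:
    """Map the outcome of a harvesting attempt to one of the three types."""
    if not harvest_succeeded:
        return Severity.ERROR
    if error_encountered:
        return Severity.WARNING
    return Severity.NOTICE


print(classify(harvest_succeeded=True, error_encountered=True))
```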
Issues : good news
Issues : bad news…
Issues: Robots.txt
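A repository manager can check whether a robots.txt file blocks a harvester before any harvesting takes place. The sketch below uses Python's standard urllib.robotparser module; the robots.txt content, the user agent name ExampleHarvester and the URLs are invented for illustration and are not CORE's actual values.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: full texts are blocked for everyone
# except one named harvester.
ROBOTS_TXT = """\
User-agent: *
Disallow: /bitstream/

User-agent: ExampleHarvester
Allow: /
"""


def can_harvest(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the rules in robots_txt let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


pdf_url = "https://repository.example.org/bitstream/1/paper.pdf"
print(can_harvest(ROBOTS_TXT, "ExampleHarvester", pdf_url))
print(can_harvest(ROBOTS_TXT, "OtherBot", pdf_url))
```

A rule that disallows the full-text path for all agents would let metadata be harvested while the documents themselves stay out of reach.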
Issues: Document Issues
Issues: Malformed PDF url
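A malformed PDF URL can often be spotted with a few structural checks before any download is attempted. The sketch below is a hypothetical validator built on the standard urllib.parse module; the specific checks are assumptions, not the rules the Dashboard applies.

```python
from urllib.parse import urlsplit


def pdf_url_problems(url: str) -> list:
    """Return a list of human-readable problems found in a full-text URL."""
    problems = []
    if url != url.strip() or " " in url:
        problems.append("contains whitespace")
    try:
        parts = urlsplit(url.strip())
    except ValueError:
        # urlsplit rejects some inputs outright, e.g. a broken IPv6 host.
        return problems + ["cannot be parsed"]
    if parts.scheme not in ("http", "https"):
        problems.append("scheme is not http or https")
    if not parts.netloc:
        problems.append("missing host")
    return problems


print(pdf_url_problems("https://repository.example.org/files/paper.pdf"))
print(pdf_url_problems("http:/repository.example.org/paper.pdf"))
```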
Dashboard benefits
- Increased and simplified collaboration between
aggregators and content providers
- Improved control of the content provider over
the harvested content
- Reduction of scepticism and fear of sharing
content with other systems
- Improvement of the harvesting process
- Broader discoverability of open access content
and thus wider reuse of that content where permitted
Would you like to take a look?
Dashboard still in BETA but we welcome
volunteer testers
Email me at nancy.pontika[at]open.ac.uk
Many thanks to…
CORE developers:
• Matteo Cancellieri
• Samuel Pearce
• Drahomira Herrmannova
• Lucas Anastasiou
Volunteer testers:
• Chris Biggs, Metadata & Repository Specialist, Open University
• Nick Sheppard, Repository Developer, Leeds Beckett University
Thank you
Questions
CORE Contacts:
Nancy Pontika nancy.pontika[at]open.ac.uk
Petr Knoth petr.knoth[at]open.ac.uk
Website: http://core.ac.uk
Twitter: @oacore

Editor's Notes

  • #2: The mission of CORE (COnnecting REpositories) is to aggregate all open access research outputs from repositories and journals worldwide and make them available to the public. In this way CORE facilitates free, unrestricted access to research for all.

    CORE:
    – supports the right of citizens and the general public to access the results of research towards which they contributed by paying taxes,
    – facilitates access to open access content for all by offering services to the general public, academic institutions, libraries, software developers, researchers, etc.,
    – provides support to both content consumers and content providers by working with digital libraries, institutional and subject repositories and journals,
    – enriches the research content using state-of-the-art technology and provides access to it through a set of services including search, API and analytical tools,
    – contributes to a cultural change by promoting open access, a fast-growing movement.

    CORE harvests openly accessible content that meets the Budapest Open Access Initiative definition: "By 'open access' to this literature, we mean its free availability on the public internet, permitting any users to read, download, copy, distribute, print, search, or link to the full texts of these articles, crawl them for indexing, pass them as data to software, or use them for any other lawful purpose, without financial, legal, or technical barriers other than those inseparable from gaining access to the internet itself. The only constraint on reproduction and distribution, and the only role for copyright in this domain, should be to give authors control over the integrity of their work and the right to be properly acknowledged and cited."