Web Archiving at the NLI
1877
2018
2019
The National Library of Ireland
• Special Collections
• Published Collections
• Digital Collections
• Development Office
• Education, Learning and Programming
• Estates
• Administration
Digital Collections at the NLI
• Digitisation
• Born Digital Pilots
• Web Archive
• IT infrastructure
• Digital Preservation
What is Web Archiving?
From the International Internet Preservation Consortium (IIPC):
Web archiving is the process of collecting portions of the World Wide Web, preserving the collections in an archival format, and then serving the archives for access and use.
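
The "archival format" in the IIPC definition is commonly the WARC file format (ISO 28500), which wraps each captured resource in a record with its own metadata headers. The slides do not say which format or tools the NLI uses, so the following is only a minimal sketch of what one WARC/1.0 `resource` record looks like; the function name and the example URL are hypothetical, and a production crawler would normally write `request` and `response` records with a dedicated library rather than by hand.

```python
import uuid
from datetime import datetime, timezone


def build_warc_resource_record(target_uri, payload, content_type="text/html"):
    """Serialise one captured payload (bytes) as a WARC/1.0 'resource' record.

    Hypothetical helper for illustration only, not the NLI's tooling.
    """
    headers = [
        ("WARC-Type", "resource"),
        ("WARC-Target-URI", target_uri),
        # Capture time in UTC, in the timestamp form WARC expects.
        ("WARC-Date", datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")),
        ("WARC-Record-ID", f"<urn:uuid:{uuid.uuid4()}>"),
        ("Content-Type", content_type),
        ("Content-Length", str(len(payload))),
    ]
    # Version line, header lines, then a blank line before the content block.
    head = "WARC/1.0\r\n" + "".join(f"{k}: {v}\r\n" for k, v in headers) + "\r\n"
    # Every record ends with two CRLF pairs after the content block.
    return head.encode("utf-8") + payload + b"\r\n\r\n"


if __name__ == "__main__":
    record = build_warc_resource_record("https://example.org/", b"<html></html>")
    print(record.decode("utf-8"))
```

Keeping the capture date and target URI inside the record is what lets an archive later serve the page "as it was" at a given time, which is the access step in the definition.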
What Web Archiving is Not…
• Search engine indexing
• Bookmarking
• Cataloguing a live website
• Downloading or saving a file or page
• Screen recording
• Screen shots
Why do we archive the web?
• Changes quickly
• An information resource
• Rich documentation of culture
• Accountability
• Our mission
The mission of the Library is to collect, preserve, promote and make accessible the documentary and intellectual record of the life of Ireland.
Web Archiving at the NLI
• Thematic & Selective
• Event Based
• Rapid Response
• Technical Partner
• Legal & Technical Restrictions
• Collaborative (where possible)
An Overview
Selection
How do we decide what to collect?
• Hard choices to make
• Is it at risk?
• Does it fit with the NLI Collection Development Policy (CDP)?
• Can it be archived?
• Who owns the website?
• Available resources?
• Expectation of inclusion?
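
The selection questions above can be kept as a simple checklist per candidate website. The sketch below is an assumption about how such a checklist might be recorded, not a description of the NLI's actual workflow; `SELECTION_QUESTIONS`, `open_questions` and the example answers are all hypothetical.

```python
# Questions taken from the selection slide.
SELECTION_QUESTIONS = [
    "Is it at risk?",
    "Does it fit with the NLI CDP?",
    "Can it be archived?",
    "Who owns the website?",
    "Available resources?",
    "Expectation of inclusion?",
]


def open_questions(answers):
    """Return the selection questions that have no recorded answer yet.

    `answers` maps a question to a free-text answer; blank answers
    count as unanswered.
    """
    return [q for q in SELECTION_QUESTIONS if not answers.get(q, "").strip()]


if __name__ == "__main__":
    # Hypothetical, partially assessed candidate site.
    answers = {
        "Is it at risk?": "Yes, campaign site likely to lapse after the vote",
        "Can it be archived?": "Yes, static pages",
    }
    for question in open_questions(answers):
        print("Still to assess:", question)
```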
Themes, Events & Topics
What kind of Collections do we have?
• Politics
• All Referendums & Elections
• Society
• Trade Union
• Rural Life
• Cultural & Creative
• Irish Language
• Literature
Web Archiving - The Process
Access
Crawling
Notification
Selection
Quality Assurance
Collecting in 2020
General Election 2020
The “Big” Event
• 14th January - Election called
• Saturday 8th February - Polling day
• Fianna Fáil - 38 seats
• Sinn Féin - 37 seats
• Fine Gael - 35 seats
• June 2020 - Formation of the Government of the 33rd Dáil
General Election 2020
Resulting Collection
• All Political Parties
• A sample of candidate websites
• Retiring TDs
• Outgoing Cabinet
• Media & Commentary
• Representative & Advocacy Groups
From the journal.ie Feb 11th, General Election Collection
What we thought was the biggest news story of the year
Collecting in a Pandemic
• Collecting begins in February
• March 12th - The NLI closes
• Web Archive switches to Working From Home
• New approach needed
Initial Response
• Identify primary online information sources
• Rapidly changing
• What are the most important websites to capture?
• How frequently do we archive?
First Steps: Frequent Archiving
• Health Service Executive (HSE)
• Health Protection Surveillance Centre (HPSC)
• Gov.ie
• The Journal.ie
• Tuairisc
• RTÉ News
Archived:
1. Weekly
2. Fortnightly
3. Each phase
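
The three capture frequencies above can be turned into a list of capture dates. This is a minimal sketch: the slides do not say which site was captured at which frequency, so the function name, the dictionary of intervals and the dates used in the example are all assumptions made for illustration.

```python
from datetime import date, timedelta

# Fixed intervals named on the slide. "Each phase" is event-driven, so it
# takes an explicit list of dates instead of an interval.
INTERVALS = {
    "weekly": timedelta(days=7),
    "fortnightly": timedelta(days=14),
}


def capture_dates(start, end, frequency, phase_dates=()):
    """Return the capture dates between start and end, both inclusive."""
    if frequency == "each phase":
        return sorted(d for d in phase_dates if start <= d <= end)
    step = INTERVALS[frequency]
    dates = []
    current = start
    while current <= end:
        dates.append(current)
        current += step
    return dates


if __name__ == "__main__":
    # Hypothetical four-week window.
    for day in capture_dates(date(2020, 3, 2), date(2020, 3, 30), "weekly"):
        print(day.isoformat())
```

Treating "each phase" as a list of supplied dates keeps the schedule honest about the fact that those captures follow external announcements rather than a calendar interval.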
Resulting Collection
• Health Sector
• Government Bodies
• Education
• Hospitality & Tourism
• Charities
Almost 3 TB of data
Uses for the Web Archive
• Amateur researchers
• New opportunities
• Cross-disciplinary
• Big data
• New tools & methods
Explore the web archive from NLI.ie
Thank You
webarchives@nli.ie
