Collection Development for
Selective Web Archiving
Nicholas Taylor (@nullhandle)
Web Archiving Service Manager
Stanford University Libraries
Archive-It Partner Meeting
August 2, 2016
the Web is on fire
“Forest wildfire” by Project LM under CC BY-NC-ND 2.0
what are we going to save?
“20130809-FS-LSC-0607” by U.S. Department of Agriculture under CC BY 2.0
an area of perceived progress
[Chart: vertical axis 0% to 70%]
NDSA: “2015 NDSA Web Archiving Survey”
growth in archiving own content
[Chart: Own content / Third-party content / Both, for 2011, 2013, and 2015; vertical axis 0% to 100%]
NDSA: “2015 NDSA Web Archiving Survey”
traditional + web content collecting
“The Cost of Poor URL Design” by Frank Farm under CC BY-NC-ND 2.0
subject expertise
Wordle: “People | Stanford University Libraries”
necessary but not sufficient
“In principle, the collection development policy for the
Tamiment Library’s Web Archive parallels that of the
Tamiment Library as a whole (labor and radicalism)”
“In practice, this is complicated by (a) the enormous size
and variety of born digital materials within Tamiment’s
collecting scope…and (c) resource restraints. Thus the
Library will not only have to carefully appraise materials,
but to set priorities and limitations.”
Tamiment Library: “Web Archiving Collecting Policy”
focus on at-risk content
“Precarious” by Paul Sableman under CC BY 2.0
complement collecting strengths
“Symbiosis” by John Spaderuiz under CC BY 2.0
observe resource constraints
“Abandoned” by Daniel D'Auria under CC BY-SA 2.0
consider what others are collecting
“2009 san diego comic-con: comics, still an elemental part of the con” by george ruiz under CC BY 2.0
consider others’ access restrictions
“Garden Wall” by yuan2003 under CC BY-NC 2.0
assess value to researchers
Archive-It: “WANE Example Use Cases”
enable specific research
“Marine Le Pen 2017”
“Alain Juppé pour la France”
“Jean-Luc Mélenchon | Le blog”
https://www.sarkozy.fr/
consider appropriate archiving tool
“Fruit Picker” by Naoto Sato under CC BY-NC-SA 2.0
save content, not links
“Signs!” by Brian Rawson-Ketchum under CC BY-SA 2.0
prefer current, esoteric content
“How Much of the Web Is Archived?” by Ainsworth, AlSum, SalahEldeen, Weigle, and Nelson (2011).
[Chart values: 79%, 68%, 16%, 19%]
support community self-archiving
“traveling Pantry community workshop” by Dan Thompson under CC BY-NC-ND 2.0
together we can preserve the Web
“Cathedral Grove” by Sang Trinh under CC BY 2.0

