Content Working Group
2013 NDSA Web Archiving
Survey Report Highlights
Nicholas Taylor (@nullhandle)
Web Archiving Service Manager
Stanford University Libraries
SAA Annual Meeting: Web Archiving Roundtable
August 13, 2014
NDSA Web Archiving Survey Working Group
Jefferson Bailey
Internet Archive / Archive-It
Kristine Hanna
Internet Archive / Archive-It
Edward McCain
University of Missouri
Cathy Hartman
University of North Texas
Abbie Grotke
Library of Congress
Christie Moffatt
National Library of Medicine
Nicholas Taylor
Stanford University
NDSA Web Archiving survey background
2011
• 78 respondents
• program info
• tools and services
• access
• policies
2013
• 92 respondents
• program info
  • staff time, metrics, skills, content concerns
• tools and services
• access and discovery
  • new discovery options
• policies
  • embargo, social media, robots.txt, resources
Respondent Characteristics
“Lego People” by Scoobay under CC BY-NC-SA 2.0
universities still make up most programs
organization type      2011  2013
College or University  47%   52%
Archive                13%   15%
State Gov              13%   13%
Other                  12%   8%
Fed Gov                8%    5%
Commercial             2%    4%
Public Library         2%    2%
Museum                 3%    1%
SAA WebArch RT tops group affiliations
group 2011 2013
8% 7%
31% 33%
45%
most programs are fractionally staffed
less than 25% FTE
25% FTE
40-50% FTE
1 FTE
1 to 3 FTE
3.5 to 15 FTE
Maturity and Progress
“Apple Mouse Evolution” by raneko under CC BY 2.0
programs have matured slightly since 2011
status                2011  2013
Active                64%   72%
Testing               16%   14%
Planning              17%   9%
No longer collecting  4%    2%
strong perceptions of progress since 2011
Significant progress: 40%
Some progress: 36%
About the same: 20%
Slightly worse off: 2%
Much worse off: 2%
many new programs since 2011
year  number of organizations
1996  1
1997  0
1998  3
1999  0
2000  2
2001  1
2002  2
2003  0
2004  2
2005  3
2006  8
2007  6
2008  5
2009  4
2010  6
2011  7
2012  12
2013  19
Archiving Focus
“Ant Farm Media Van v.08 (Time Capsule) in Bellewether at Southern Exposure” by Steve Rhodes under CC BY-NC-SA 2.0
more programs are only self-archiving
archiving focus           2011  2013
Archive other sites only  31%   15%
Archive both              49%   48%
Archive own site only     20%   37%
concern about social media, databases, video
content type       number of organizations
Social Media       69
Databases          65
Video              64
Interactive Media  49
Audio              40
Blogs              32
Art                16
untapped interest in collaboration
response categories: Yes; No; Not yet, but interested; Don't know
2011: 21%, 72%, 7%
2013: 17%, 47%, 33%, 2%
Tools and Services
“Photocopier” by Joriel "Joz" Jimenez under CC BY-NC-ND 2.0
web archiving as a service still most popular
approach  2011  2013
External  60%   63%
In-house  25%   20%
Both      14%   16%
data not transferred from service provider
status               2011  2013
Transferred          19%   20%
Haven't transferred  81%   80%
increased use of tools supporting W/ARC
tool support           2011  2013
Supports W/ARC         24%   38%
Doesn't support W/ARC  76%   62%
less granular descriptive metadata
bar chart, 2011 vs. 2013 (values in extracted order): 62%, 66%, 47%, 55%, 30%, 36%, 54%, 60%, 43%, 50%, 22%, 18%, 20%, 5%, 20%
Archiving Policies
“Handle With Care” by ServInt under CC BY-NC-ND 2.0
most don’t notify or seek permission
action              Capture  Provide restricted access  Provide public access
No action           42       42                         45
Notify              17       7                          11
Request permission  14       13                         15
more conditional handling of robots.txt
handling                                     2011  2013
Always respect robots.txt                    38%   22%
Sometimes/conditionally respect robots.txt   33%   55%
Never respect robots.txt                     8%    8%
Don't know                                   21%   16%
social media archiving policies are uncommon
Has social media archiving policy: 24%
Lacks social media archiving policy: 76%
policies based on community practices
Other organizations: 36%
ARL Code of Best Practices: 27%
Section 108 Study Group: 17%
Counsel or service provider: 7%
Oakland Archive Policy: 4%
Statute: 4%
Don't know: 5%
takeaways and questions for SAA WebArch RT
• for individual organizations:
  • if you’re only self-archiving, what’s on your roadmap?
  • how are you preserving your web archive data?
  • how do you describe and enable discovery of web archives?
  • how do you handle robots.txt?
  • what are your plans for social media archiving policy?
• for the group:
  • what is this group (vs. IIPC, NDSA) best equipped to do?
  • what kind of collaboration are you interested in?
Nicholas Taylor
@nullhandle
“Thank You” by vistamommy under CC BY 2.0

